Jed Rembold
July 9, 2025


Should you want to play around more with Neo4j locally, Docker Compose makes it simple:
services:
  neo4j:
    image: neo4j:latest
    volumes:
      - ./logs:/logs
      - ./config:/config
      - ./data:/data
      - ./plugins:/plugins
    environment:
      - NEO4J_AUTH=neo4j/yourpassword
    ports:
      - "7474:7474"
      - "7687:7687"
    restart: always
7474 is the port of the web interface
7687 is the API (Bolt) port, and what should be used if you are trying to connect other clients to the database
Nodes are tagged with labels, such as :Person
Relationships are tagged with types, such as [:FRIENDS_WITH], [:SIBLING_OF], or [:ACTED_IN]
Suppose we wanted to create objects of these types
We need to use the declarative language Cypher
Initially just focused on creation
CREATE
(a:Person {name: "Jed", age: 40, loc: "Salem"}),
(b:Person {name: "Luke", age: 38, loc: "Abq"}),
(a)-[:BROTHER_OF {since: 1987, mother: "Ginger", father: "Rick"}]->(b)
( ) are used to indicate nodes
-[ ]-> are used to indicate relationships
Properties go inside { }, with keys and values separated by :
The |||expr|||:|||label||| pattern sets a node variable equal to the |||expr|||, which can be used elsewhere in the expression
Relationships are written as (|||some node|||) -[|||relation|||]-> (|||other node|||)
Queries are built from the clauses MATCH and RETURN
The most basic structure of a query would look like:
MATCH
(n:Person)-[:BROTHER_OF]->(:Person)
RETURN
n
This would return a list of all node objects that match the given pattern
You can return as many things as you want, just separate them with commas
You can access specific properties by using dot notation:
RETURN n.name, n.age for instance
The information you return can be from as many different nodes or relations as you want, provided you give them a variable name to refer to them by
MATCH (a:Person)-[:BROTHER_OF]->(b:Person)
RETURN a.name, b.name
To filter the matches, include the required property in the node definition
MATCH (a:Person {loc: "Salem"})-[:BROTHER_OF]->(b:Person)
RETURN a.name, b.name
Or you can use a WHERE clause
MATCH (a:Person)-[:BROTHER_OF]->(b:Person)
WHERE a.loc = "Salem"
RETURN a.name, b.name
You do not need to specify a label or a relation type
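To make the two filtering forms shown above more concrete, here is a minimal plain-Python sketch (no Neo4j involved) that mimics the MATCH / WHERE / RETURN steps on the two Person nodes from the CREATE example. The dict-and-tuple storage and the `brothers` helper are illustrative assumptions only, not how Neo4j stores or searches a graph.

```python
# Toy stand-in for the graph built by the earlier CREATE statement.
# Nodes are dicts keyed by a variable-like id; relationships are tuples.
nodes = {
    "a": {"label": "Person", "name": "Jed", "age": 40, "loc": "Salem"},
    "b": {"label": "Person", "name": "Luke", "age": 38, "loc": "Abq"},
}
relationships = [("a", "BROTHER_OF", "b")]


def brothers(loc=None):
    """Mimic MATCH (a:Person)-[:BROTHER_OF]->(b:Person) RETURN a.name, b.name.

    If loc is given, also require a.loc == loc, which is the condition that
    both the {loc: ...} property map and the WHERE form express.
    """
    rows = []
    for start, rel_type, end in relationships:
        a, b = nodes[start], nodes[end]
        if rel_type != "BROTHER_OF":
            continue
        if a["label"] != "Person" or b["label"] != "Person":
            continue
        if loc is not None and a["loc"] != loc:
            continue
        rows.append((a["name"], b["name"]))
    return rows


print(brothers())         # every BROTHER_OF pair
print(brothers("Salem"))  # only pairs whose start node is in Salem
print(brothers("Abq"))    # no start node is in Abq, so no rows
```

Because the relationship points from Jed to Luke, filtering on the start node's location of "Abq" returns nothing, just as the Cypher pattern would.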
Query patterns can be more than just a single relation, but entire chains
MATCH
(a:Actor)-[:ACTED_IN]->(m:Movie),
(ca)-[r:ACTED_IN]->(m)
WHERE a.name = "Christian Bale"
RETURN
ca.name, r.roles, m.name
elementId(x) will get you the unique identifier of x, where x could be a node or relation
type(r) will get you the type associated with a relation
labels(n) will get you the label(s) associated with any given node
If you have a longer chain that you want to match, you can use the repetition operator * in your relation statement
For example,
MATCH (a)-[:KNOWS*]->(b)
would match where node a and b were connected through any number of nodes that “knew” each other
Frequently, you’d want to further restrict this with a number:
MATCH (a)-[:FRIENDS_WITH*2]->(b)
to get friends of friends for example
You can also query for a range of hops:
MATCH (a)-[:KNOWS*1..6]->(b)
would match anything between 1 and 6 steps away
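As a rough illustration of what a hop range means, the plain-Python sketch below enumerates the paths that a pattern like (a)-[:KNOWS*1..6]->(b) describes. It is a simplification under stated assumptions: the edge list, the names, and the `variable_length_paths` helper are all hypothetical, every edge is treated as the same relationship type, and this is not how Neo4j actually executes the query.

```python
def variable_length_paths(edges, min_hops=1, max_hops=6):
    """Enumerate paths matching a pattern like (a)-[:KNOWS*min..max]->(b).

    edges is a list of directed (source, target) pairs, all assumed to be of
    one relationship type. As in Cypher, a relationship is not reused within
    a single path, although a node may be visited more than once.
    """
    paths = []

    def extend(path_nodes, used_edges):
        hops = len(path_nodes) - 1
        if hops >= min_hops:
            paths.append(list(path_nodes))
        if hops == max_hops:
            return
        for index, (source, target) in enumerate(edges):
            if source == path_nodes[-1] and index not in used_edges:
                extend(path_nodes + [target], used_edges | {index})

    # Try every node that has an outgoing edge as a starting point.
    for start in sorted({source for source, _ in edges}):
        extend([start], frozenset())
    return paths


knows = [("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Dan")]
print(variable_length_paths(knows, 2, 2))       # exactly two hops, like *2
print(len(variable_length_paths(knows, 1, 6)))  # anything from 1 to 6 hops
```

With this three-edge chain, the exact two-hop call finds Ann to Cat and Bob to Dan, while the 1..6 range finds every sub-chain.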
If what you are mainly interested in is the entire matching path, you can also assign a variable to it
MATCH p = (a)-[:KNOWS*1..6]->(b)
RETURN p
neo4j (shocking)
Add the apache-airflow-providers-neo4j library to your Dockerfile
This is almost identical to Mongo
Import the hook:
from airflow.providers.neo4j.hooks.neo4j import Neo4jHook
Create the hook in your task
hook = Neo4jHook(conn_id=|||conn name|||)
You can then run queries directly with
results = hook.run(|||Cypher query|||)
Neo4jHook is utilizing the neo4j Python library
You could use that library directly, but the .run() method of Neo4jHook should largely make it unnecessary.
from neo4j import GraphDatabase
driver = GraphDatabase.driver(
    "bolt://hostname:7687",
    auth=(|||user|||, |||password|||)
)
with driver.session() as s:
    results = s.run(|||Cypher query|||)
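If you do call the driver directly, one pattern worth considering is passing values as query parameters ($loc below) rather than pasting them into the query string. The sketch assumes the Person data from the CREATE example; the `brothers_from` function and the `FakeSession` / `FakeRecord` classes are hypothetical stand-ins so the example runs without a database. With a real driver you would pass the object returned by driver.session() instead.

```python
QUERY = """
MATCH (a:Person)-[:BROTHER_OF]->(b:Person)
WHERE a.loc = $loc
RETURN a.name AS a_name, b.name AS b_name
"""


def brothers_from(session, loc):
    """Run QUERY with loc bound to $loc and return a list of plain dicts.

    session is expected to behave like a neo4j session: run(query, **params)
    yields records, and each record has a data() method returning a dict.
    """
    result = session.run(QUERY, loc=loc)
    return [record.data() for record in result]


# --- Hypothetical stand-ins so the sketch runs without a server ---
class FakeRecord:
    def __init__(self, values):
        self._values = values

    def data(self):
        return dict(self._values)


class FakeSession:
    """Pretends to be a session over the Jed/Luke data."""

    def run(self, query, **params):
        rows = [{"a_name": "Jed", "b_name": "Luke", "loc": "Salem"}]
        return [
            FakeRecord({"a_name": r["a_name"], "b_name": r["b_name"]})
            for r in rows
            if r["loc"] == params["loc"]
        ]


print(brothers_from(FakeSession(), "Salem"))
```

Converting each record to a dict inside the function means the caller gets ordinary Python data that can be handed straight to json.dumps.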
To save the output of hook.run, just convert it to JSON with json.dumps and then write it to your S3 bucket
If you later want to load the graph with the networkx library, you might consider a node-link format, as shown below
{
  "nodes": [
    {
      "id": "1",
      "label": "Node A"
    },
    {
      "id": "2",
      "label": "Node B"
    },
    {
      "id": "3",
      "label": "Node C"
    }
  ],
  "edges": [
    {
      "source": "1",
      "target": "2",
      "label": "Edge 1-2"
    },
    {
      "source": "2",
      "target": "3",
      "label": "Edge 2-3"
    }
  ]
}
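As a rough sketch of how query results could be reshaped into this format before uploading, the function below assumes each row is a dict with source, target, and label keys (a hypothetical row shape, for example from RETURN a.name AS source, b.name AS target, type(r) AS label). The `rows_to_node_link` name is made up for illustration, and the S3 upload itself is left out.

```python
import json


def rows_to_node_link(rows):
    """Build a {"nodes": [...], "edges": [...]} dict from relationship rows.

    Each row is assumed to look like
    {"source": ..., "target": ..., "label": ...}; node ids are taken
    directly from the source and target values.
    """
    node_ids = []
    edges = []
    for row in rows:
        for node_id in (row["source"], row["target"]):
            if node_id not in node_ids:  # keep first-seen order, no duplicates
                node_ids.append(node_id)
        edges.append(
            {"source": row["source"], "target": row["target"], "label": row["label"]}
        )
    nodes = [{"id": node_id, "label": node_id} for node_id in node_ids]
    return {"nodes": nodes, "edges": edges}


rows = [
    {"source": "Jed", "target": "Luke", "label": "BROTHER_OF"},
]
payload = json.dumps(rows_to_node_link(rows), indent=2)
print(payload)  # this string is what would be written to the bucket
```

Depending on your networkx version, its node-link reader may expect the edge list under a different key (older releases default to "links"), so check its documentation before loading the file.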