Good news for Cypher users: the IN clause now uses indexes. In the previous version of Cypher it did not, which was a source of performance problems.
Consider the following query, adapted from a similar query in Chapter 4 of the book:
MATCH (n:User) WHERE n.email IN {emailQuery} RETURN n.userId
Profiling the query in Neo4j 2.0, you get the following plan:
"plan": {
  "name": "ColumnFilter",
  "rows": 1,
  "dbHits": 0,
  "children": [
    {
      "name": "Extract",
      "rows": 1,
      "dbHits": 1,
      "children": [
        {
          "name": "Filter",
          "args": {
            "pred": "any(-_-INNER-_- in {emailQuery} where Property(n,email(5)) == -_-INNER-_-)",
            "_rows": 1,
            "_db_hits": 1002
          },
          "rows": 1,
          "dbHits": 1002,
          "children": [
            {
              "name": "NodeByLabel",
              "rows": 1002,
              "dbHits": 0,
              "children": []
            }
          ]
        }
      ]
    }
  ]
}
With Neo4j 2.1.2, you get the following plan instead:
"plan": {
  "name": "ColumnFilter",
  "args": {
    "ColumnsLeft": "keep columns n.userId",
    "Rows": "Rows(1)",
    "DbHits": "DbHits(0)"
  },
  "rows": 1,
  "dbHits": 0,
  "children": [
    {
      "name": "Extract",
      "rows": 1,
      "dbHits": 2,
      "children": [
        {
          "name": "SchemaIndex",
          "args": {
            "DbHits": "DbHits(2)",
            "Rows": "Rows(1)",
            "LegacyExpression": "{emailQuery}",
            "IntroducedIdentifier": "IntroducedIdentifier(n)",
            "Index": ":User(email)"
          },
          "rows": 1,
          "dbHits": 2,
          "children": []
        }
      ]
    },
    {
      "name": "SchemaIndex",
      "args": {
        "DbHits": "DbHits(2)",
        "Rows": "Rows(1)",
        "LegacyExpression": "{emailQuery}",
        "IntroducedIdentifier": "IntroducedIdentifier(n)",
        "Index": ":User(email)"
      },
      "rows": 1,
      "dbHits": 2,
      "children": []
    }
  ]
}
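Clearly, the number of DB hits is much lower. In Neo4j 2.0, NodeByLabel returns all 1002 User nodes and the Filter step costs 1002 DB hits to check each one against the parameter. In Neo4j 2.1.2, the SchemaIndex lookup on :User(email) finds the matching node with only 2 DB hits.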
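To compare the two plans at a glance, you can walk each plan tree and add up the dbHits values. The following Python snippet is only a rough sketch: the helper names walk and total_db_hits are made up for this post, the plans are trimmed copies of the ones printed above (the "args" and "rows" entries are left out), and it assumes that each dbHits value counts the hits of that operator alone, not of its children.

```python
def walk(plan):
    """Yield (operator name, dbHits) for a plan node and all its descendants."""
    yield plan["name"], plan.get("dbHits", 0)
    for child in plan.get("children", []):
        yield from walk(child)


def total_db_hits(plan):
    """Sum dbHits over the whole plan tree (assumes per-operator counts)."""
    return sum(hits for _, hits in walk(plan))


# Trimmed copy of the Neo4j 2.0 plan shown above.
plan_2_0 = {
    "name": "ColumnFilter", "dbHits": 0, "children": [
        {"name": "Extract", "dbHits": 1, "children": [
            {"name": "Filter", "dbHits": 1002, "children": [
                {"name": "NodeByLabel", "dbHits": 0, "children": []},
            ]},
        ]},
    ],
}

# Trimmed copy of the Neo4j 2.1.2 plan, exactly as printed above
# (including the second SchemaIndex entry).
plan_2_1_2 = {
    "name": "ColumnFilter", "dbHits": 0, "children": [
        {"name": "Extract", "dbHits": 2, "children": [
            {"name": "SchemaIndex", "dbHits": 2, "children": []},
        ]},
        {"name": "SchemaIndex", "dbHits": 2, "children": []},
    ],
}

for label, plan in (("2.0", plan_2_0), ("2.1.2", plan_2_1_2)):
    # Most expensive operator first, then the total for the plan.
    worst = max(walk(plan), key=lambda pair: pair[1])
    print(label, "total:", total_db_hits(plan), "worst operator:", worst)
```

Under that assumption the totals are 1003 DB hits for the 2.0 plan against 6 for the 2.1.2 plan as printed, and the most expensive operator in 2.0 is the Filter step.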