Good news for Cypher users. In earlier versions of Cypher, the IN clause didn't use indexes, which caused a performance problem.
Consider the following query, adapted from a similar query in Chapter 4 of the book:
MATCH (n:User) WHERE n.email IN {emailQuery} RETURN n.userId
Profiling the query in Neo4j 2.0 gives the following plan:
  "plan": {
    "name": "ColumnFilter",
    "rows": 1,
    "dbHits": 0,
    "children": [
      {
        "name": "Extract",
        "rows": 1,
        "dbHits": 1,
        "children": [
          {
            "name": "Filter",
            "args": {
              "pred": "any(-_-INNER-_- in {emailQuery} where Property(n,email(5)) == -_-INNER-_-)",
              "_rows": 1,
              "_db_hits": 1002
            },
            "rows": 1,
            "dbHits": 1002,
            "children": [
              {
                "name": "NodeByLabel",
                "rows": 1002,
                "dbHits": 0,
                "children": []
              }
            ]
          }
        ]
      }
    ]
  }
With Neo4j 2.1.2, you get the following plan instead:
  "plan": {
    "name": "ColumnFilter",
    "args": {
      "ColumnsLeft": "keep columns n.userId",
      "Rows": "Rows(1)",
      "DbHits": "DbHits(0)"
    },
    "rows": 1,
    "dbHits": 0,
    "children": [
      {
        "name": "Extract",
        "rows": 1,
        "dbHits": 2,
        "children": [
          {
            "name": "SchemaIndex",
            "args": {
              "DbHits": "DbHits(2)",
              "Rows": "Rows(1)",
              "LegacyExpression": "{emailQuery}",
              "IntroducedIdentifier": "IntroducedIdentifier(n)",
              "Index": ":User(email)"
            },
            "rows": 1,
            "dbHits": 2,
            "children": []
          }
        ]
      },
      {
        "name": "SchemaIndex",
        "args": {
          "DbHits": "DbHits(2)",
          "Rows": "Rows(1)",
          "LegacyExpression": "{emailQuery}",
          "IntroducedIdentifier": "IntroducedIdentifier(n)",
          "Index": ":User(email)"
        },
        "rows": 1,
        "dbHits": 2,
        "children": []
      }
    ]
  }
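Clearly, the number of DB hits is much lower. The 2.0 plan fetches all 1002 User nodes by label (NodeByLabel) and then checks the email property of each one, which costs 1002 DB hits in the Filter step. The 2.1.2 plan finds the node directly through the :User(email) schema index (SchemaIndex) with only 2 DB hits.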
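To put a number on the difference, here is a minimal Python sketch that adds up the dbHits values over each plan tree. The helper name total_db_hits and the trimmed dictionaries (only name, dbHits and children are kept, copied from the plans as printed above) are my own illustration, not part of Neo4j. It also assumes that each dbHits value counts only that operator's own hits, so the totals are a rough comparison rather than an official metric.

```python
def total_db_hits(plan):
    """Recursively sum dbHits over a plan node and all of its children.

    Assumption: each node's dbHits covers only that operator's own hits.
    """
    own_hits = plan.get("dbHits", 0)
    child_hits = sum(total_db_hits(child) for child in plan.get("children", []))
    return own_hits + child_hits


# Trimmed copy of the Neo4j 2.0 plan shown above (label scan plus filter).
plan_2_0 = {
    "name": "ColumnFilter", "dbHits": 0, "children": [
        {"name": "Extract", "dbHits": 1, "children": [
            {"name": "Filter", "dbHits": 1002, "children": [
                {"name": "NodeByLabel", "dbHits": 0, "children": []},
            ]},
        ]},
    ],
}

# Trimmed copy of the Neo4j 2.1.2 plan as printed above (schema index lookup).
plan_2_1_2 = {
    "name": "ColumnFilter", "dbHits": 0, "children": [
        {"name": "Extract", "dbHits": 2, "children": [
            {"name": "SchemaIndex", "dbHits": 2, "children": []},
        ]},
        {"name": "SchemaIndex", "dbHits": 2, "children": []},
    ],
}

if __name__ == "__main__":
    print("Neo4j 2.0:  ", total_db_hits(plan_2_0))    # 1003
    print("Neo4j 2.1.2:", total_db_hits(plan_2_1_2))  # 6
```

Under those assumptions the totals are 1003 DB hits for the 2.0 plan against 6 for the 2.1.2 plan.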