Unable to filter out paths containing self edges using a cypher Match query.

I am trying the following MATCH query to filter out paths that contain self edges (a node with an edge to itself), so that they do not inflate the path length:

match p = (base:party{node_id:'245793137123'})<-[trail:ownership*..]-(leaves:party) where none( rel in relationships(p) where startNode(rel) = endNode(rel) ) return count(p)

But this query does not filter anything out, and I get the entire set back.
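For reference, the intent of the none(...) predicate can be sketched outside the database. The Python below is a minimal illustration only, assuming a path is represented as a list of (start, end) node-id pairs; that representation and the name `no_self_edges` are hypothetical and are not part of Cypher or Gremlin.

```python
from typing import List, Tuple

# One relationship as (start node id, end node id); a hypothetical representation.
Rel = Tuple[str, str]


def no_self_edges(path: List[Rel]) -> bool:
    """Return True when no relationship in the path starts and ends at the same node."""
    return all(start != end for start, end in path)


# Three candidate paths; only the second one contains a self edge (B -> B).
paths = [
    [("B", "A")],
    [("B", "A"), ("B", "B")],
    [("B", "A"), ("C", "B")],
]
kept = [p for p in paths if no_self_edges(p)]
print(len(kept))
```

The expected behaviour of the query is the same: a path is kept only if every relationship in it connects two different nodes.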

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
dwitry commented, Aug 27, 2019

.neq(' cypher.null') is a guard that handles cypher.null, which represents null in Cypher for Gremlin.

The following snippet is closer to the “root cause”:

.repeat(__
            .inE('ownership')
            .as('trail')
            .aggregate('  cypher.path.edge.p')
            .outV()
    )
.emit()
.until(__
        .path()
        .from('base')
        .count(local)
        .is(gte(21))
    )

This is how the variable-length path <-[trail:ownership*..]- is translated. Basically, it is an imperative way to traverse relationships (21 is a limit to avoid infinite loops). If there is a loop, the path is traversed repeatedly, which is what happens in the query you’ve provided.

So we need to find a better implementation of this in Gremlin but, as I said in the previous message, this is a rather complicated task: there are other edge cases to consider, as well as performance.
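The effect of a self edge on such a loop can be illustrated with a small simulation. The Python below is a minimal sketch, not the Cypher for Gremlin implementation: the function name, the toy graph, and the `max_hops` parameter (standing in for the path-length limit) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# An 'ownership' edge as (owner, owned); hypothetical toy representation.
Edge = Tuple[str, str]


def incoming_paths(edges: List[Edge], base: str, max_hops: int) -> List[List[Edge]]:
    """Emit every path of 1..max_hops incoming edges starting at `base`.

    Like an imperative repeat/emit loop, this keeps no record of edges
    already used, so a self edge can be walked again on every iteration.
    """
    incoming: Dict[str, List[Edge]] = {}
    for owner, owned in edges:
        incoming.setdefault(owned, []).append((owner, owned))

    emitted: List[List[Edge]] = []
    frontier: List[List[Edge]] = [[]]  # the empty path sits at `base`
    for _ in range(max_hops):
        next_frontier: List[List[Edge]] = []
        for path in frontier:
            current = path[-1][0] if path else base
            for edge in incoming.get(current, []):
                extended = path + [edge]
                emitted.append(extended)
                next_frontier.append(extended)
        frontier = next_frontier
    return emitted


# B owns A, C owns B, and B also has an ownership edge to itself.
edges = [("B", "A"), ("B", "B"), ("C", "B")]
all_paths = incoming_paths(edges, "A", max_hops=4)
without_self = [p for p in all_paths if all(o != d for o, d in p)]
print(len(all_paths), len(without_self))  # prints "7 2"
```

In this toy graph the self edge on B is walked again on every iteration, so the number of emitted paths grows with the hop limit, while only two paths are free of self edges.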

0 reactions
pushkarnagpal commented, Aug 27, 2019

Variable length path in loops is a complex case, which is tricky to implement in Gremlin in a way that fully covers all the edge cases. I’ve invested a lot of time trying to implement it properly (1, 2), but the solution is still not universal. Unfortunately, I cannot provide any estimate of when this will be improved.

Ahh, okay. I was working on a fraud detection case and got stuck here. I saw this query conversion from Cypher in the gremlin-server.log file. It contains a .neq(' cypher.null') step, which I assume corresponds to the none(...) where clause. I think this is the root cause of the bug.

g.V().as('base')
	.hasLabel('party').has('node_id', eq('245793137123'))
	.repeat(__
				.inE('ownership')
				.as('trail')
				.aggregate('  cypher.path.edge.p')
				.outV()
		)
	.emit()
	.until(__
			.path()
			.from('base')
			.count(local)
			.is(gte(21))
		)
	.as('leaves')
	.hasLabel('party')
	.path()
	.from('base')
	.as('p')
	.where(__
			.not(__.V().hasLabel('party').outE('ownership').inV().as('  GENERATED4').where(__.select('  GENERATED4').where(eq('leaves'))))
		)
	.optional(__
		.select(all, 'trail').as('trail')
		)
	.select('p')
	.is(neq('  cypher.null'))
	.count()
	.project('count(p)')
	.by(__.identity())

Hope I could help somehow with this. Thanks!
