[CEP] CommCare Query Language for Case Search
Abstract
Currently we use ‘XPath’ for parsing query expressions to generate Elasticsearch queries for case search. In some places we even call it XPath (_xpath_query).
This CEP proposes that we break away from the XPath syntax to allow us to develop a more expressive query language that more closely aligns with how the queries are evaluated.
Motivation
Although we are parsing the queries as XPath we are not trying to make them behave like XPath. One prime example of this is our syntax for querying ancestor cases:
parent/parent/name = 'joe'
In XPath that would indicate a traversal down the hierarchy and not up.
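To make the difference concrete, here is a minimal in-memory sketch of how case search reads that path: each step follows a case index upward to an ancestor, and the last segment is a property of that ancestor. The data layout and the helper name resolve_ancestor_path are hypothetical and are not taken from the CommCare HQ code base.

```python
# Hypothetical in-memory cases; "indices" maps an index identifier
# to the case id it points at (the supercase).
CASES = {
    "grandparent": {"name": "joe", "indices": {}},
    "mother": {"name": "ann", "indices": {"parent": "grandparent"}},
    "child": {"name": "sam", "indices": {"parent": "mother"}},
}


def resolve_ancestor_path(case_id, path):
    """Follow each index identifier in `path` upward, then read the final property."""
    *steps, prop = path.split("/")
    current = CASES[case_id]
    for identifier in steps:
        target_id = current["indices"].get(identifier)
        if target_id is None:
            return None  # no such ancestor, so the expression cannot match
        current = CASES[target_id]
    return current.get(prop)


# Cases whose grandparent (parent of parent) is named 'joe'.
matches = [
    case_id
    for case_id in CASES
    if resolve_ancestor_path(case_id, "parent/parent/name") == "joe"
]
```

An XPath engine evaluating the same path against a document would instead look for child elements named parent beneath the context node.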
We also only support ‘binary expressions’ that evaluate to a boolean.
Furthermore, we do support XPath on Mobile / Web Apps, so the more we make this query language appear to be XPath, the more likely there is to be confusion about what is supported in one or the other.
Specification
The current base spec would match our existing capabilities with the addition of set operations as a mechanism of querying related cases.
An example parser that supports the current query features and set queries has been made here.
Set Query
Why do we need them?
- You can’t filter cases based on their subcases.
- Filtering by ancestors is restrictive and has poor performance characteristics.
You can query ancestors using the parent/parent/name = 'bob' syntax, but it is very restrictive and only allows filtering on a single property of the ancestor at a time. Filtering on multiple properties requires doing this:
parent/name = 'michaela' and parent/is_active = 1
The result is multiple join queries, one per ancestor filter, which is very inefficient.
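The cost can be illustrated with a small in-memory sketch. It is not how CommCare HQ executes the query; ids_matching is a hypothetical stand-in for one ancestor-side lookup, and the counter only shows that each ancestor filter written separately needs its own lookup, while a single filtered set needs one.

```python
# Hypothetical cases; "parent" holds the parent's case id (or None).
CASES = [
    {"case_id": "p1", "name": "michaela", "is_active": 1, "parent": None},
    {"case_id": "p2", "name": "michaela", "is_active": 0, "parent": None},
    {"case_id": "c1", "name": "a", "is_active": 1, "parent": "p1"},
    {"case_id": "c2", "name": "b", "is_active": 1, "parent": "p2"},
]

stats = {"lookups": 0}  # counts ancestor-side lookups only


def ids_matching(predicate):
    """Stand-in for one lookup of ancestor case ids."""
    stats["lookups"] += 1
    return {c["case_id"] for c in CASES if predicate(c)}


def children_of(parent_ids):
    """Cases whose parent is in the given id set."""
    return {c["case_id"] for c in CASES if c["parent"] in parent_ids}


# parent/name = 'michaela' and parent/is_active = 1:
# one ancestor lookup per filter, then intersect the results.
per_filter = (
    children_of(ids_matching(lambda c: c["name"] == "michaela"))
    & children_of(ids_matching(lambda c: c["is_active"] == 1))
)
per_filter_lookups = stats["lookups"]

# A single filtered set of ancestors needs only one lookup.
stats["lookups"] = 0
combined = children_of(
    ids_matching(lambda c: c["name"] == "michaela" and c["is_active"] == 1)
)
combined_lookups = stats["lookups"]
```

Both approaches select the same case here, but the per-filter form grows by one lookup for every additional ancestor property.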
Set query syntax
Set queries create a filtered set of cases. Applying modifiers to that set joins it with the parent query so that it can be used for filtering.
Filtering on subcases
{age > 3 and active = 1}.subcase('parent').exists()
Filtering on supercases (parent, host, ancestor)
{age > 3 and active = 1}.supercase('parent').exists()
Nesting
{age > 3 and active = 1 and {@case_type = 'lab' and result=1}.subcase('parent').exists() }.subcase('parent').exists()
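The intended semantics can be sketched in memory as follows. This assumes one reading of the syntax: the braces define a filter, .subcase('parent') or .supercase('parent') selects related cases through the index with that identifier, and .exists() is true when at least one related case passes the filter. The helper names and sample data are hypothetical; no parser is involved.

```python
# Hypothetical cases; "indices" maps an index identifier to the supercase id.
CASES = [
    {"case_id": "h1", "case_type": "household", "age": 40, "active": 1, "indices": {}},
    {"case_id": "h2", "case_type": "household", "age": 50, "active": 0, "indices": {}},
    {"case_id": "m1", "case_type": "member", "age": 5, "active": 1, "indices": {"parent": "h1"}},
    {"case_id": "m2", "case_type": "member", "age": 2, "active": 1, "indices": {"parent": "h2"}},
    {"case_id": "l1", "case_type": "lab", "result": 1, "indices": {"parent": "m1"}},
]


def has_subcase(case, identifier, predicate):
    """True if some case points at `case` via `identifier` and passes `predicate`."""
    return any(
        other["indices"].get(identifier) == case["case_id"] and predicate(other)
        for other in CASES
    )


def has_supercase(case, identifier, predicate):
    """True if the case that `case` points at via `identifier` passes `predicate`."""
    target_id = case["indices"].get(identifier)
    return any(other["case_id"] == target_id and predicate(other) for other in CASES)


def old_and_active(c):
    # age > 3 and active = 1 (missing properties never match)
    return c.get("age", 0) > 3 and c.get("active") == 1


def positive_lab(c):
    # @case_type = 'lab' and result = 1
    return c.get("case_type") == "lab" and c.get("result") == 1


# {age > 3 and active = 1}.subcase('parent').exists()
with_subcase = [c["case_id"] for c in CASES if has_subcase(c, "parent", old_and_active)]

# {age > 3 and active = 1}.supercase('parent').exists()
with_supercase = [c["case_id"] for c in CASES if has_supercase(c, "parent", old_and_active)]

# Nesting: the inner set query becomes part of the outer filter.
nested = [
    c["case_id"]
    for c in CASES
    if has_subcase(
        c,
        "parent",
        lambda sub: old_and_active(sub) and has_subcase(sub, "parent", positive_lab),
    )
]
```

Because the whole filter is evaluated against a single related case, conditions on several properties no longer need one join each.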
Alternatives
These alternatives make use of existing XPath syntax and so can still be parsed using the XPath parser. The main argument against them is that they push us further into ‘XPath’ territory and blur the line between in-app and case search XPath support.
Function
subcase_exists('parent', age > 3 and active = 1)
# nesting
parent_exists('parent', age > 3 and active = 1 and parent_exists('parent', result = 1))
Predicates
subcase_exists[identifier='parent'][age > 3 and active = 1]
# alternative for ancestor queries which is an extension of the current syntax
parent[age > 3 and active = 1]/parent[name='joe']
See https://github.com/dimagi/commcare-hq/issues/31118 for more details on the ancestor predicate version.
Impact on users
Give users more expressive queries that perform better.
Impact on hosting
None
Backwards compatibility
Existing queries will continue to work
Open questions and issues
Top GitHub Comments
No objections here.
Re: nesting syntax error
That makes a lot more sense. Thanks for fixing.
I know Set Query syntax has been ruled out, but for clarity on the proposed syntax, is there also a missing "and" syntax error in the Nesting expression?
See also https://github.com/dimagi/commcare-hq/pull/31187