Failure evaluating expressions for NOT + STARTS_WITH predicate

scala> val df = spark.read.format("iceberg").load("/tmp/iceberg/logs")
df: org.apache.spark.sql.DataFrame = [level: string]

scala> df.filter(not($"level".startsWith("b"))).show()
java.lang.IllegalArgumentException: No negation for operation: STARTS_WITH
  at org.apache.iceberg.expressions.Expression$Operation.negate(Expression.java:72)
  at org.apache.iceberg.expressions.UnboundPredicate.negate(UnboundPredicate.java:69)
  at org.apache.iceberg.expressions.RewriteNot.not(RewriteNot.java:44)
  at org.apache.iceberg.expressions.RewriteNot.not(RewriteNot.java:22)
  at org.apache.iceberg.expressions.ExpressionVisitors.visit(ExpressionVisitors.java:293)
  at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.project(Projections.java:153

Several places in Iceberg call RewriteNot on user-provided filters (1, 2, 3 and many more). However, the STARTS_WITH predicate does not support negation, so RewriteNot throws an exception. Should we implement a NOT_STARTS_WITH operation to support this use case?
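To make the NOT_STARTS_WITH proposal concrete, here is a minimal, self-contained sketch of an operation enum in which STARTS_WITH has a negation. This is a toy model, not Iceberg's source: the class `NegationSketch`, the enum `Op`, and the `test` method are hypothetical names chosen for illustration, and null handling is deliberately left out.

```java
// Toy model only: NegationSketch, Op and test(...) are hypothetical names,
// not part of Iceberg's API.
final class NegationSketch {
    enum Op {
        EQ, NOT_EQ, STARTS_WITH, NOT_STARTS_WITH;

        // Returns the operation that matches exactly the rows this one rejects.
        Op negate() {
            switch (this) {
                case EQ: return NOT_EQ;
                case NOT_EQ: return EQ;
                case STARTS_WITH: return NOT_STARTS_WITH;
                case NOT_STARTS_WITH: return STARTS_WITH;
                default:
                    throw new IllegalArgumentException("No negation for operation: " + this);
            }
        }

        // Evaluates the operation on a non-null value (assumption: no nulls).
        boolean test(String value, String literal) {
            switch (this) {
                case EQ: return value.equals(literal);
                case NOT_EQ: return !value.equals(literal);
                case STARTS_WITH: return value.startsWith(literal);
                case NOT_STARTS_WITH: return !value.startsWith(literal);
                default:
                    throw new IllegalStateException("Unknown operation: " + this);
            }
        }
    }
}
```

With a pairing like this, a rewrite step that pushes NOT down to the leaves could replace NOT(STARTS_WITH) with NOT_STARTS_WITH instead of throwing. A real implementation would also have to decide how the new operation behaves for null values and in the evaluators and projections that consume it.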

Another approach would be to keep the NOT expression in RewriteNot for predicates that do not support negation. However, some evaluators in Iceberg assume that the tree contains no NOT nodes, so this could affect their correctness. https://github.com/apache/iceberg/blob/425a45f8acec0496d77e070c07fb209de92ab2c1/api/src/main/java/org/apache/iceberg/expressions/Projections.java#L148
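The second approach can be illustrated with a small sketch of a NOT-pushdown rewrite over a toy expression tree. It is not Iceberg's RewriteNot: `ToyRewriteNot`, `Expr`, `Pred`, `Kind`, `Not`, `And`, and `Or` are all hypothetical types invented for this example. The rewrite applies De Morgan's laws and negates leaf predicates where a negation is known. For STARTS_WITH, which has none in this model, it leaves the NOT in place.

```java
// Toy expression tree for illustration only; none of these types are Iceberg's API.
final class ToyRewriteNot {
    interface Expr {}

    enum Kind { EQ, NOT_EQ, STARTS_WITH }

    static final class Pred implements Expr {
        final Kind kind;
        final String column;
        final String literal;
        Pred(Kind kind, String column, String literal) {
            this.kind = kind;
            this.column = column;
            this.literal = literal;
        }
    }

    static final class Not implements Expr {
        final Expr child;
        Not(Expr child) { this.child = child; }
    }

    static final class And implements Expr {
        final Expr left;
        final Expr right;
        And(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    static final class Or implements Expr {
        final Expr left;
        final Expr right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Pushes NOT nodes toward the leaves of the tree.
    static Expr rewrite(Expr expr) {
        return rewrite(expr, false);
    }

    private static Expr rewrite(Expr expr, boolean negated) {
        if (expr instanceof Not) {
            // A NOT flips the pending negation and disappears from the tree.
            return rewrite(((Not) expr).child, !negated);
        }
        if (expr instanceof And) {
            And and = (And) expr;
            Expr left = rewrite(and.left, negated);
            Expr right = rewrite(and.right, negated);
            // De Morgan: NOT (a AND b) == (NOT a) OR (NOT b)
            return negated ? new Or(left, right) : new And(left, right);
        }
        if (expr instanceof Or) {
            Or or = (Or) expr;
            Expr left = rewrite(or.left, negated);
            Expr right = rewrite(or.right, negated);
            // De Morgan: NOT (a OR b) == (NOT a) AND (NOT b)
            return negated ? new And(left, right) : new Or(left, right);
        }
        Pred pred = (Pred) expr;
        if (!negated) {
            return pred;
        }
        switch (pred.kind) {
            case EQ: return new Pred(Kind.NOT_EQ, pred.column, pred.literal);
            case NOT_EQ: return new Pred(Kind.EQ, pred.column, pred.literal);
            default:
                // No known negation (STARTS_WITH here): keep the NOT at the leaf.
                return new Not(pred);
        }
    }
}
```

In this model, rewriting NOT(level = 'info' AND level STARTS_WITH 'b') yields level != 'info' OR NOT(level STARTS_WITH 'b'). The leftover NOT node is exactly the concern raised above: any consumer that assumes a NOT-free tree would have to be taught to handle it.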

cc: @rdblue @aokolnychyi

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
rdblue commented, Dec 31, 2020

I’m not. Go for it!

0 reactions
cccs-eric commented, Dec 1, 2021

I ran into the same issue while trying to run a SQL query using Spark against an Iceberg table. It crashed with the same error: IllegalArgumentException: No negation for operation: STARTS_WITH. I ran the following query in a Jupyter Notebook:

%%sparksql --view bug --cache
SELECT
    *
FROM
    namespace.my_table AS tbl
WHERE
    tbl.aDate > CURRENT_DATE() - INTERVAL 10 DAYS
    AND tbl.activityDisplayName = 'Update user'
    AND rawJSON LIKE '%SourceAnchor%'
    AND displayName NOT LIKE 'Sync%'

If I create a temporary table in Spark from one or more of the table's data files and issue the same query, it works. So the failure occurs only when Iceberg is involved.
