Expression optimizer should be able to disable constant folding when the returned value is huge

We recently encountered a coordinator reliability issue. The query contained an expression like the following:

          CASE
              WHEN x <= 10000 THEN SEQUENCE(1, COALESCE(x, 1))
              WHEN x <= 20000 THEN FLATTEN(ARRAY[
                  SEQUENCE(1, 10000),
                  SEQUENCE(10001, x)
              ])
              WHEN x <= 30000 THEN FLATTEN(ARRAY[
                  SEQUENCE(1, 10000),
                  SEQUENCE(10001, 20000),
                  SEQUENCE(20001, x)
              ])
              WHEN x <= 40000 THEN FLATTEN(ARRAY[
                  SEQUENCE(1, 10000),
                  SEQUENCE(10001, 20000),
                  SEQUENCE(20001, 30000),
                  SEQUENCE(30001, x)
              ])
              WHEN x <= 50000 THEN FLATTEN(ARRAY[
                  SEQUENCE(1, 10000),
                  SEQUENCE(10001, 20000),
                  SEQUENCE(20001, 30000),
                  SEQUENCE(30001, 40000),
                  SEQUENCE(40001, x)
              ])
... more WHEN

Presto optimizes constant expressions such as SEQUENCE(1, 10000) into a “magic literal”, which is basically the serialized bytes of the evaluated result.

However, if the evaluated constant value is huge (in this case, there are many arrays of size 10K), the plan itself becomes huge.
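As a rough back-of-the-envelope sketch of that growth: in the expression above, the k-th WHEN branch contains k - 1 fully constant SEQUENCE calls, so the number of folded arrays grows quadratically with the number of branches. The function names below are hypothetical, and the estimate assumes that the elided branches continue the same pattern, that each element takes 8 bytes (BIGINT), and that serialization overhead and encoding are ignored.

```python
def folded_constant_arrays(num_branches: int) -> int:
    """Count the SEQUENCE calls whose arguments are both constants.

    Branch k (1-based) holds k - 1 such calls; its last SEQUENCE depends
    on x and therefore cannot be folded.
    """
    return sum(k - 1 for k in range(1, num_branches + 1))


def estimated_literal_bytes(
    num_branches: int,
    elements_per_array: int = 10_000,  # size of each constant array in the query
    bytes_per_element: int = 8,        # assumption: BIGINT elements, no overhead
) -> int:
    """Estimate the raw bytes embedded in the plan by folded array literals."""
    return folded_constant_arrays(num_branches) * elements_per_array * bytes_per_element


if __name__ == "__main__":
    for branches in (5, 10, 20):
        print(branches, folded_constant_arrays(branches), estimated_literal_bytes(branches))
```

For the five branches shown, this comes to 10 folded arrays and about 800,000 bytes of raw element data before any encoding overhead.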

https://github.com/prestodb/presto/issues/8964 further amplifies the issue, and we should definitely fix that. However, I also think there should be some kind of guard so that Presto stops constant folding when the result would be too large, keeping the plan at a reasonable size.

cc @highker, @rongrong , @hellium01

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Comments:9 (8 by maintainers)

Top GitHub Comments

1 reaction
kaikalur commented, Sep 17, 2019

Yeah, we could explicitly cap array literal size at something small, like 100 or so, and suggest the user use SEQUENCE if it’s bigger. I would think the optimization opportunities for such large literals are rare enough that it may be OK to not always do it.
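A minimal sketch of such a cap, written as a toy constant folder in Python rather than against Presto's actual optimizer. The expression classes, the `fold` function, and the handling of only ascending `sequence` calls are all assumptions made for illustration; only the cap of 100 comes from the suggestion above.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Variable:
    name: str


@dataclass(frozen=True)
class Call:
    name: str
    args: Tuple["Expr", ...]


Expr = Union[Literal, Variable, Call]

MAX_FOLDED_ELEMENTS = 100  # cap suggested in the comment above


def fold(expr: Expr, max_elements: int = MAX_FOLDED_ELEMENTS) -> Expr:
    """Fold constant sequence(start, stop) calls unless the result is too big."""
    if not isinstance(expr, Call):
        return expr
    # Fold the arguments first so nested constant calls are handled.
    args = tuple(fold(arg, max_elements) for arg in expr.args)
    if expr.name == "sequence" and all(isinstance(a, Literal) for a in args):
        start, stop = args[0].value, args[1].value
        # The size is known before evaluation, so nothing large is built.
        if start <= stop and stop - start + 1 <= max_elements:
            return Literal(tuple(range(start, stop + 1)))
    # Too large, not constant, or not a sequence call: keep the call as is.
    return Call(expr.name, args)


if __name__ == "__main__":
    print(fold(Call("sequence", (Literal(1), Literal(3)))))
    print(fold(Call("sequence", (Literal(1), Literal(10_000)))))
```

The point of the design is that the size check runs on the arguments alone, so an oversized call stays in the plan as a compact function call and is evaluated at execution time.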

1 reaction
hellium01 commented, Sep 17, 2019

There might be cases where constant folding is useful later. For example, if we call an apply/reduce function on top of an expanded sequence, the actual result will be small. Some optimizers can also utilize the fact that a field is constant. However, it looks like SEQUENCE will very rarely be used in such cases, so simply disabling it should be a safe bet.
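To illustrate the reduce case in plain Python (not Presto code; the `sequence` helper is a hypothetical stand-in for SEQUENCE): the intermediate array is large, but the fully folded result is a single value, so folding the whole expression would actually shrink the plan.

```python
from functools import reduce


def sequence(start: int, stop: int) -> list:
    """Stand-in for SEQUENCE(start, stop) with an inclusive upper bound."""
    return list(range(start, stop + 1))


# The intermediate value has 10,000 elements...
expanded = sequence(1, 10_000)

# ...but reducing it yields one number, which is cheap to embed in a plan.
total = reduce(lambda acc, value: acc + value, expanded, 0)

if __name__ == "__main__":
    print(len(expanded), total)
```

A size guard that only looks at the intermediate SEQUENCE would miss this opportunity, which is the trade-off being described.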

We currently convert array constants into a RowExpression rather than a “magic literal”, so they are serialized as a block, which is much smaller than before. Another option is to change the serialization format of arrays to be more concise (which would benefit data exchange as well).
