Expression optimizer should be able to disable constant folding when the returned value is huge
We recently encountered a coordinator reliability issue. The query contained an expression like the following:
CASE
WHEN x <= 10000 THEN SEQUENCE(1, COALESCE(x, 1))
WHEN x <= 20000 THEN FLATTEN(ARRAY[
SEQUENCE(1, 10000),
SEQUENCE(10001, x)
])
WHEN x <= 30000 THEN FLATTEN(ARRAY[
SEQUENCE(1, 10000),
SEQUENCE(10001, 20000),
SEQUENCE(20001, x)
])
WHEN x <= 40000 THEN FLATTEN(ARRAY[
SEQUENCE(1, 10000),
SEQUENCE(10001, 20000),
SEQUENCE(20001, 30000),
SEQUENCE(30001, x)
])
WHEN x <= 50000 THEN FLATTEN(ARRAY[
SEQUENCE(1, 10000),
SEQUENCE(10001, 20000),
SEQUENCE(20001, 30000),
SEQUENCE(30001, 40000),
SEQUENCE(40001, x)
])
... more WHEN
Presto will optimize a constant expression such as SEQUENCE(1, 10000) into a “magic literal”, which is basically the serialized bytes of the evaluated result.
However, if the evaluated constant value is huge (e.g. in this case, there are many arrays of size 10K), it will make the plan huge.
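To get a feel for the growth, here is a rough back-of-the-envelope sketch in Python. It assumes that only the fully constant SEQUENCE calls are folded (the ones ending in x cannot be), that each folded element costs 8 bytes as a BIGINT, and it ignores any serialization overhead, so the numbers are a lower bound and not a measurement of a real plan. The function names are made up for illustration.

```python
# Rough estimate of how many constant bytes end up embedded in the plan.
# Assumption: 8 bytes per folded element (BIGINT); real serialized literals
# carry extra encoding overhead, so treat the result as a lower bound.

CHUNK = 10_000          # size of each constant SEQUENCE in the query
BYTES_PER_ELEMENT = 8   # assumed BIGINT width


def folded_constant_arrays(num_branches: int) -> int:
    """WHEN branch k (1-based) contains k - 1 fully constant SEQUENCE calls."""
    return sum(k - 1 for k in range(1, num_branches + 1))


def estimated_literal_bytes(num_branches: int) -> int:
    """Raw bytes of all folded arrays across the CASE expression."""
    return folded_constant_arrays(num_branches) * CHUNK * BYTES_PER_ELEMENT


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, folded_constant_arrays(n), estimated_literal_bytes(n))
```

The number of folded arrays grows quadratically with the number of WHEN branches, so the five branches shown already embed 10 arrays of 10K elements, and every additional branch adds more than the previous one did.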
https://github.com/prestodb/presto/issues/8964 further amplifies the issue, and we should definitely fix that. However, I also think there should be some kind of guard so that Presto stops constant folding when the result would be too large, keeping the plan at a reasonable size.
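One possible shape for such a guard, sketched in Python on a toy expression tree and not taken from Presto's code: fold a constant subexpression only when its evaluated value stays under a size limit, and otherwise keep the original call in the plan. Everything here (Const, Var, Call, FUNCTIONS, MAX_FOLDED_ELEMENTS, the limit of 1,000 elements) is hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Tuple, Union

MAX_FOLDED_ELEMENTS = 1_000  # hypothetical threshold, not a Presto setting


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    fn: str
    args: Tuple["Expr", ...]


Expr = Union[Const, Var, Call]

# Toy stand-ins for SQL functions; names are illustrative only.
FUNCTIONS = {
    "sequence": lambda lo, hi: list(range(lo, hi + 1)),
    "array_sum": lambda xs: sum(xs),
}


class NotConstant(Exception):
    """Raised when an expression depends on a variable."""


def evaluate(expr: Expr) -> object:
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Var):
        raise NotConstant(expr.name)
    return FUNCTIONS[expr.fn](*[evaluate(a) for a in expr.args])


def size_of(value: object) -> int:
    return len(value) if isinstance(value, list) else 1


def fold(expr: Expr, limit: int = MAX_FOLDED_ELEMENTS) -> Expr:
    """Replace constant subtrees by literals unless the literal would be too big."""
    if not isinstance(expr, Call):
        return expr
    try:
        value = evaluate(expr)
    except NotConstant:
        pass  # depends on a variable: cannot fold at this level
    else:
        if size_of(value) <= limit:
            return Const(value)
    # Too large or not constant: keep the call, but still try the arguments.
    return Call(expr.fn, tuple(fold(a, limit) for a in expr.args))
```

With this sketch, sequence(1, 10000) stays a call because its value exceeds the limit, while array_sum(sequence(1, 10000)) still folds to a single number because the check looks at the final result. Note that the large intermediate array is still materialized during planning, and subtrees are re-evaluated at each level, so a real implementation would likely want to bound evaluation cost as well.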
cc @highker, @rongrong , @hellium01
Issue Analytics
- State:
- Created 4 years ago
- Comments:9 (8 by maintainers)
Top GitHub Comments
Yeah we could cap array literal size explicitly to be small like 100 or so and suggest the user use SEQUENCE if it’s bigger. I would think the optimization opportunities for such large literals are rare enough that it may be ok to not do it all the time.
There might be cases where constant folding is useful later. For example, if we call an apply/reduce function on top of an expanded sequence, the actual result will be small. Some optimizers can also make use of the fact that a field is constant. However, it looks like SEQUENCE will very rarely be used in such cases, so simply disabling it should be a safe bet.
We currently convert array constants into RowExpression rather than a “magic literal”, so they are serialized as a block, which is much smaller than before. Another way is to change the serialization form of arrays to be more concise (which will benefit data exchange as well).