Excessive classes are generated during lambda execution
The simplest way I found to reproduce the problem is to have a table with a lot of nested arrays (e.g. ARRAY(ARRAY(INTEGER))):
CREATE TABLE test_100k_rows AS
SELECT id AS rid
FROM
(
SELECT sequence(1, 100000) AS seq
) tmp
CROSS JOIN UNNEST(seq) AS t(id);

CREATE TABLE test_2lev_array AS
SELECT transform(sequence(1, 1000), x->ARRAY[x]) AS arr
FROM test_100k_rows;
This will generate a table with 100k rows (I think 10k rows are also enough). Each row contains an ARRAY<ARRAY<INTEGER>>, and each outer array contains 1000 entries.
The following query will cause the problem:
SELECT count(filter(arr, x->false))
FROM test_2lev_array
You will then see about 100k LambdaForm classes generated, named LambdaForm$xxxx.class, which are all (almost) the same. Here is the disassembled code of one example:
Compiled from "LambdaForm$BMH55095"
final class java.lang.invoke.LambdaForm$BMH55095 {
static java.lang.Object reinvoke_55095(java.lang.Object, java.lang.Object);
Code:
0: ldc #12 // String CONSTANT_PLACEHOLDER_0 <<(Block)Boolean : BMH.reinvoke_029=Lambda(a0:L/SpeciesData<LLL>,a1:L)=>{\n t2:L=BoundMethodHandle$Species_L3.argL2(a0:L);\n t3:L=BoundMethodHandle$Species_L3.argL1(a0:L);\n t4:L=BoundMethodHandle$Species_L3.argL0(a0:L);\n t5:L=MethodHandle.invokeBasic(t4:L,t3:L,t2:L,a1:L);t5:L}\n& BMH=[MethodHandle(PageProjection_81,ConnectorSession,Block)Boolean, com.facebook.presto.$gen.PageProjection_81{projection=filter(#0, (expr) -> false)}, FullConnectorSession{queryId=20170501_213554_00000_ujd4a, user=wxie, timeZoneKey=America/Los_Angeles, locale=en_US, startTime=1493674554081}]>>
2: checkcast #14 // class java/lang/invoke/MethodHandle
5: astore_0
6: aload_0
7: checkcast #16 // class java/lang/invoke/BoundMethodHandle$Species_L3
10: dup
11: astore_0
12: getfield #20 // Field java/lang/invoke/BoundMethodHandle$Species_L3.argL2:Ljava/lang/Object;
15: astore_2
16: aload_0
17: getfield #23 // Field java/lang/invoke/BoundMethodHandle$Species_L3.argL1:Ljava/lang/Object;
20: astore_3
21: aload_0
22: getfield #26 // Field java/lang/invoke/BoundMethodHandle$Species_L3.argL0:Ljava/lang/Object;
25: astore 4
27: aload 4
29: checkcast #14 // class java/lang/invoke/MethodHandle
32: aload_3
33: aload_2
34: aload_1
35: invokevirtual #30 // Method java/lang/invoke/MethodHandle.invokeBasic:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
38: areturn
static void dummy();
Code:
0: ldc #34 // String BMH.reinvoke_55095=Lambda(a0:L/SpeciesData<LLL>,a1:L)=>{\n t2:L=BoundMethodHandle$Species_L3.argL2(a0:L);\n t3:L=BoundMethodHandle$Species_L3.argL1(a0:L);\n t4:L=BoundMethodHandle$Species_L3.argL0(a0:L);\n t5:L=MethodHandle.invokeBasic(t4:L,t3:L,t2:L,a1:L);t5:L}
2: pop
3: return
}
Note that the problem doesn’t happen for flat arrays (e.g. ARRAY<INTEGER>). It also doesn’t happen for nested arrays when each array contains <= 50 entries. This looks like some odd behavior in the HotSpot JVM.
Pull Request for Capture Support: https://github.com/prestodb/presto/pull/7210
It seems like this is related to java.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD in MethodHandleStatics (this is a JVM arg with a default value of 127, and if you set this one in your test program you will see it affects the output). I think we need to better understand how method handle customization works.

Some reading:
mlvm-dev mailing list thread about the expected outcome of lambda form customization: http://mail.openjdk.java.net/pipermail/mlvm-dev/2017-May/006755.html
mlvm-dev mailing list thread about the performance of invokeinterface (functional object created by the lambda metafactory) vs. invokeBasic (MethodHandle customization): http://mail.openjdk.java.net/pipermail/mlvm-dev/2018-February/006829.html
Lambda: A peek under the hood by Brian Goetz: http://chariotsolutions.com/wp-content/uploads/presentation/2014/04/Brian-Goetz-Lambda-Under-The-Hood.pdf
State of the Lambda: https://cr.openjdk.java.net/~briangoetz/lambda/lambda-state-final.html
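
To experiment with the customization threshold outside of Presto, a small standalone program along these lines might help. It is only a sketch of the suspected pattern (many distinct bound method handles, each invoked many times), not the code Presto generates. The class name, the alwaysFalse stand-in, and the handle and call counts are all made up for illustration. The class-count delta is JVM-dependent, so no particular output is claimed.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.management.ManagementFactory;

final class CustomizeThresholdDemo {
    // Hypothetical stand-in for the generated projection `x -> false`:
    // two values that get bound up front, plus the element being tested.
    static Boolean alwaysFalse(Object projection, Object session, Object element) {
        return Boolean.FALSE;
    }

    // Creates `handles` distinct bound method handles and invokes each one
    // `callsPerHandle` times. Returns how many calls returned true (always 0 here).
    static long invokeBoundHandles(int handles, int callsPerHandle) {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    CustomizeThresholdDemo.class, "alwaysFalse",
                    MethodType.methodType(Boolean.class, Object.class, Object.class, Object.class));
            Object element = new Object();
            long trueCount = 0;
            for (int i = 0; i < handles; i++) {
                // A fresh bound handle per iteration (assumed analogous to one per row).
                MethodHandle bound =
                        MethodHandles.insertArguments(target, 0, new Object(), new Object());
                for (int j = 0; j < callsPerHandle; j++) {
                    Boolean result = (Boolean) bound.invokeExact(element);
                    if (result) {
                        trueCount++;
                    }
                }
            }
            return trueCount;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Total number of classes loaded since JVM start, as reported by the JVM.
    static long totalLoadedClasses() {
        return ManagementFactory.getClassLoadingMXBean().getTotalLoadedClassCount();
    }

    public static void main(String[] args) {
        int handles = 10_000;
        // 50 and 1000 mirror the array sizes mentioned in the report.
        for (int calls : new int[] {50, 1000}) {
            long before = totalLoadedClasses();
            invokeBoundHandles(handles, calls);
            long after = totalLoadedClasses();
            System.out.println(calls + " calls per handle: "
                    + (after - before) + " classes loaded");
        }
    }
}
```

Running it with and without -Djava.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD=<n> and comparing the two printed deltas should show whether the threshold changes how many classes get loaded on a given JVM.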