Linkage Checker's OOM in beam-sdks-java-io-hcatalog
CC: @iemejia
Unfortunately the previous enhancement #1239 could not fix the beam-sdks-java-io-hcatalog problem: https://gist.github.com/suztomo/b175357b2fb7d03d69204dc1f35c4d20
suztomo@suxtomo24:~/beam6$ ./gradlew -Ppublishing -PjavaLinkageArtifactIds=beam-sdks-java-io-hcatalog :checkJavaLinkage
...
> Task :checkJavaLinkage
...
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3181)
at java.util.ArrayList.grow(ArrayList.java:265)
at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
at java.util.ArrayList.add(ArrayList.java:462)
at com.google.cloud.tools.opensource.dependencies.DependencyGraph.addPath(DependencyGraph.java:68)
at com.google.cloud.tools.opensource.dependencies.DependencyGraphBuilder.levelOrder(DependencyGraphBuilder.java:333)
at com.google.cloud.tools.opensource.dependencies.DependencyGraphBuilder.buildDependencyGraph(DependencyGraphBuilder.java:236)
at com.google.cloud.tools.opensource.dependencies.DependencyGraphBuilder.buildFullDependencyGraph(DependencyGraphBuilder.java:202)
at com.google.cloud.tools.opensource.classpath.ClassPathBuilder.resolve(ClassPathBuilder.java:69)
at com.google.cloud.tools.opensource.classpath.LinkageCheckerMain.main(LinkageCheckerMain.java:73)
> Task :checkJavaLinkage FAILED
...
BUILD FAILED in 48m 29s
903 actionable tasks: 160 executed, 743 up-to-date
(It did bring a great improvement for beam-sdks-java-extensions-sql-zetasql, as noted in a comment.)
Todo
Investigate why beam-sdks-java-io-hcatalog is so heavy.
- Test case: 99852024
- How many nodes in the graph (levelOrder’s argument)? More than 3,000,000. After 2,000,000, it becomes very slow. It had 80,000,000 nodes in the tree.
- Any node that has unexpectedly many children? (say 100)
Opportunity to optimize the result.
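The node-count questions above can be explored with a small experiment. The following is only a sketch, not the Linkage Checker's actual code: the class `TreeSizeSketch`, its methods, and the layered test graph are all hypothetical, and the dependency graph is assumed to be a plain map from an artifact name to its direct dependencies. It shows how a level-order walk that does not deduplicate nodes visits far more tree nodes than the graph has distinct artifacts, and how to list nodes with unexpectedly many children.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch; none of these names come from the Linkage Checker.
public class TreeSizeSketch {

  /**
   * Counts the nodes visited by a level-order walk of the unfolded dependency
   * tree (no deduplication), stopping once the count reaches the limit.
   */
  public static long countTreeNodes(
      Map<String, List<String>> graph, String root, long limit) {
    Queue<String> queue = new ArrayDeque<>();
    queue.add(root);
    long count = 0;
    while (!queue.isEmpty() && count < limit) {
      String node = queue.remove();
      count++;
      // Every occurrence of a dependency is expanded again, as in a full tree.
      queue.addAll(graph.getOrDefault(node, Collections.<String>emptyList()));
    }
    return count;
  }

  /** Returns the nodes that have at least the given number of direct children. */
  public static List<String> nodesWithManyChildren(
      Map<String, List<String>> graph, int threshold) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : graph.entrySet()) {
      if (entry.getValue().size() >= threshold) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  /**
   * Builds a test graph: a root plus the given number of layers, two nodes per
   * layer, where each node depends on both nodes of the next layer.
   */
  public static Map<String, List<String>> layeredGraph(int layers) {
    Map<String, List<String>> graph = new HashMap<>();
    graph.put("root", Arrays.asList("a1", "b1"));
    for (int i = 1; i < layers; i++) {
      List<String> next = Arrays.asList("a" + (i + 1), "b" + (i + 1));
      graph.put("a" + i, next);
      graph.put("b" + i, next);
    }
    return graph;
  }

  public static void main(String[] args) {
    Map<String, List<String>> graph = layeredGraph(10);
    System.out.println("tree nodes: " + countTreeNodes(graph, "root", Long.MAX_VALUE));
    System.out.println("wide nodes: " + nodesWithManyChildren(graph, 100));
  }
}
```

With 10 layers the graph has only 21 distinct artifacts, yet the unfolded tree has 2^11 - 1 = 2047 nodes, so the tree size doubles with every extra layer of shared dependencies. The `limit` argument is there so that a very large tree, or a cycle, stops the walk instead of exhausting the heap.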
Issue Analytics
- State:
- Created 4 years ago
- Comments:46 (41 by maintainers)

Thanks @suztomo for keeping this in mind. In any case, the other fix is a great improvement, so hopefully this case helps to find more refinements. I will follow this issue. Thanks for working on these improvements 👍
Thanks @suztomo for the explanation. It makes sense to align with Maven’s behavior.