Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
Is your enhancement request related to a problem? Please describe.
The Pulsar IO .nar files are currently very large: the total size of the Pulsar IO files is 1952MB. Breakdown: https://gist.github.com/lhotari/810a543524e25457b521ac666913ad3c
Describe the solution you’d like
Exclude all Pulsar Functions Worker dependencies from Pulsar IO .nar files.
For example,
$ unzip -l ~/.m2/repository/org/apache/pulsar/pulsar-io-data-generator/2.8.0-SNAPSHOT/pulsar-io-data-generator-2.8.0-SNAPSHOT.nar |grep META-INF/bundled-dependencies | sort -k 4,4
0 02-12-2021 07:04 META-INF/bundled-dependencies/
183117 02-12-2021 07:04 META-INF/bundled-dependencies/aircompressor-0.16.jar
4467 02-12-2021 07:04 META-INF/bundled-dependencies/aopalliance-1.0.jar
449146 02-12-2021 07:04 META-INF/bundled-dependencies/async-http-client-2.12.1.jar
9909 02-12-2021 07:04 META-INF/bundled-dependencies/async-http-client-netty-utils-2.12.1.jar
566992 02-12-2021 07:04 META-INF/bundled-dependencies/avro-1.9.1.jar
25683 02-12-2021 07:04 META-INF/bundled-dependencies/avro-protobuf-1.9.1.jar
887800 02-12-2021 07:04 META-INF/bundled-dependencies/bcpkix-jdk15on-1.68.jar
6031548 02-12-2021 07:04 META-INF/bundled-dependencies/bcprov-ext-jdk15on-1.68.jar
5961178 02-12-2021 07:04 META-INF/bundled-dependencies/bcprov-jdk15on-1.68.jar
146056 02-12-2021 07:04 META-INF/bundled-dependencies/bookkeeper-common-4.12.1.jar
16852 02-12-2021 07:04 META-INF/bundled-dependencies/bookkeeper-common-allocator-4.12.1.jar
19351 02-12-2021 07:04 META-INF/bundled-dependencies/bookkeeper-stats-api-4.12.1.jar
11082557 02-12-2021 07:04 META-INF/bundled-dependencies/bouncy-castle-bc-2.8.0-SNAPSHOT-pkg.jar
214381 02-12-2021 07:04 META-INF/bundled-dependencies/checker-qual-3.5.0.jar
65366 02-12-2021 07:04 META-INF/bundled-dependencies/circe-checksum-4.12.1.jar
284184 02-12-2021 07:04 META-INF/bundled-dependencies/commons-codec-1.10.jar
615064 02-12-2021 07:04 META-INF/bundled-dependencies/commons-compress-1.19.jar
362679 02-12-2021 07:04 META-INF/bundled-dependencies/commons-configuration-1.10.jar
208700 02-12-2021 07:04 META-INF/bundled-dependencies/commons-io-2.5.jar
284220 02-12-2021 07:04 META-INF/bundled-dependencies/commons-lang-2.6.jar
494856 02-12-2021 07:04 META-INF/bundled-dependencies/commons-lang3-3.6.jar
61829 02-12-2021 07:04 META-INF/bundled-dependencies/commons-logging-1.2.jar
2213560 02-12-2021 07:04 META-INF/bundled-dependencies/commons-math3-3.6.1.jar
23508 02-12-2021 07:04 META-INF/bundled-dependencies/cpu-affinity-4.12.1.jar
13879 02-12-2021 07:04 META-INF/bundled-dependencies/error_prone_annotations-2.3.4.jar
4617 02-12-2021 07:04 META-INF/bundled-dependencies/failureaccess-1.0.1.jar
240255 02-12-2021 07:04 META-INF/bundled-dependencies/gson-2.8.6.jar
2862361 02-12-2021 07:04 META-INF/bundled-dependencies/guava-30.1-jre.jar
674028 02-12-2021 07:04 META-INF/bundled-dependencies/guice-4.1.0.jar
42873 02-12-2021 07:04 META-INF/bundled-dependencies/guice-assistedinject-4.1.0.jar
45012 02-12-2021 07:04 META-INF/bundled-dependencies/iban4j-3.2.1.jar
8781 02-12-2021 07:04 META-INF/bundled-dependencies/j2objc-annotations-1.3.jar
68167 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-annotations-2.11.1.jar
351575 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-core-2.11.1.jar
1419800 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-databind-2.11.1.jar
46983 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-dataformat-yaml-2.11.1.jar
79295 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-module-jsonSchema-2.11.1.jar
780265 02-12-2021 07:04 META-INF/bundled-dependencies/javassist-3.25.0-GA.jar
78030 02-12-2021 07:04 META-INF/bundled-dependencies/javax.activation-1.2.0.jar
2497 02-12-2021 07:04 META-INF/bundled-dependencies/javax.inject-1.jar
127509 02-12-2021 07:04 META-INF/bundled-dependencies/javax.ws.rs-api-2.1.jar
2254 02-12-2021 07:04 META-INF/bundled-dependencies/jcip-annotations-1.0.jar
252020 02-12-2021 07:04 META-INF/bundled-dependencies/jctools-core-2.1.2.jar
566323 02-12-2021 07:04 META-INF/bundled-dependencies/jetty-util-9.4.35.v20201120.jar
273528 02-12-2021 07:04 META-INF/bundled-dependencies/jfairy-0.5.9.jar
640724 02-12-2021 07:04 META-INF/bundled-dependencies/joda-time-2.10.1.jar
19936 02-12-2021 07:04 META-INF/bundled-dependencies/jsr305-3.0.2.jar
2199 02-12-2021 07:04 META-INF/bundled-dependencies/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
24995 02-12-2021 07:04 META-INF/bundled-dependencies/memory-0.8.3.jar
289921 02-12-2021 07:04 META-INF/bundled-dependencies/netty-buffer-4.1.51.Final.jar
320174 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-4.1.51.Final.jar
61345 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-dns-4.1.51.Final.jar
36193 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-haproxy-4.1.51.Final.jar
617948 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-http-4.1.51.Final.jar
625057 02-12-2021 07:04 META-INF/bundled-dependencies/netty-common-4.1.51.Final.jar
456702 02-12-2021 07:04 META-INF/bundled-dependencies/netty-handler-4.1.51.Final.jar
21842 02-12-2021 07:04 META-INF/bundled-dependencies/netty-reactive-streams-2.0.4.jar
33158 02-12-2021 07:04 META-INF/bundled-dependencies/netty-resolver-4.1.51.Final.jar
151765 02-12-2021 07:04 META-INF/bundled-dependencies/netty-resolver-dns-4.1.51.Final.jar
4017922 02-12-2021 07:04 META-INF/bundled-dependencies/netty-tcnative-boringssl-static-2.0.33.Final.jar
473222 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-4.1.51.Final.jar
152317 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar
33062 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final.jar
56446 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final-linux-x86_64.jar
1660960 02-12-2021 07:04 META-INF/bundled-dependencies/protobuf-java-3.11.4.jar
73874 02-12-2021 07:04 META-INF/bundled-dependencies/protobuf-java-util-3.11.4.jar
47021 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-client-admin-api-2.8.0-SNAPSHOT.jar
141344 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-client-api-2.8.0-SNAPSHOT.jar
657161 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-client-original-2.8.0-SNAPSHOT.jar
877274 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-common-2.8.0-SNAPSHOT.jar
38477 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-config-validation-2.8.0-SNAPSHOT.jar
21681 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-functions-api-2.8.0-SNAPSHOT.jar
23202 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-io-core-2.8.0-SNAPSHOT.jar
28200 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-package-core-2.8.0-SNAPSHOT.jar
9037 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-transaction-common-2.8.0-SNAPSHOT.jar
11369 02-12-2021 07:04 META-INF/bundled-dependencies/reactive-streams-1.0.3.jar
130999 02-12-2021 07:04 META-INF/bundled-dependencies/reflections-0.9.11.jar
421509 02-12-2021 07:04 META-INF/bundled-dependencies/sketches-core-0.8.3.jar
41203 02-12-2021 07:04 META-INF/bundled-dependencies/slf4j-api-1.7.25.jar
284338 02-12-2021 07:04 META-INF/bundled-dependencies/snakeyaml-1.18.jar
21782 02-12-2021 07:04 META-INF/bundled-dependencies/swagger-annotations-1.6.2.jar
63777 02-12-2021 07:04 META-INF/bundled-dependencies/validation-api-1.1.0.Final.jar
pulsar-io-data-generator has a single unique dependency, jfairy. Everything else in the listing is also needed by other Pulsar components, which means that about 45MB of the dependencies are redundant in each pulsar-io .nar file.
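The redundant share can be tallied from the `unzip -l` output with a short script. This is a minimal sketch, not part of the Pulsar build: the function names, the `SAMPLE` excerpt (three entries copied from the listing above plus the directory entry), and the idea of identifying connector-specific jars by name prefix are all assumptions made for illustration. It also assumes entry names contain no spaces.

```python
PREFIX = "META-INF/bundled-dependencies/"


def bundled_sizes(listing):
    """Map each bundled jar name to its size in bytes, from `unzip -l` text."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        # Expect: size, date, time, name. Skips headers, footers and blanks.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        size, name = int(fields[0]), fields[3]
        # Keep only files under the bundled-dependencies directory.
        if not name.startswith(PREFIX) or name.endswith("/"):
            continue
        sizes[name[len(PREFIX):]] = size
    return sizes


def redundant_bytes(sizes, unique_prefixes):
    """Sum the sizes of jars that are not connector-specific (hypothetical rule)."""
    unique = tuple(unique_prefixes)
    return sum(size for jar, size in sizes.items() if not jar.startswith(unique))


# Excerpt of the listing above, used only to exercise the functions.
SAMPLE = """\
0 02-12-2021 07:04 META-INF/bundled-dependencies/
4467 02-12-2021 07:04 META-INF/bundled-dependencies/aopalliance-1.0.jar
2497 02-12-2021 07:04 META-INF/bundled-dependencies/javax.inject-1.jar
273528 02-12-2021 07:04 META-INF/bundled-dependencies/jfairy-0.5.9.jar
"""

sizes = bundled_sizes(SAMPLE)
print(redundant_bytes(sizes, ["jfairy-"]))
```

Feeding the full listing into `bundled_sizes` instead of `SAMPLE` would give the per-connector figure; the sizes reported by `unzip -l` are uncompressed, so the result is an upper bound on the savings in the .nar file itself.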
The redundant jars are never used for classloading. It is safe to remove every dependency that is already part of the Pulsar Functions Worker’s system classloader, because classloaders use parent-first lookups (by default, and also in Pulsar Functions Worker): a class that the parent can provide is always loaded from the parent, so the copy bundled in the .nar file is never consulted.
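The parent-first argument can be illustrated with a toy model. This is only a sketch of the delegation order, not Pulsar's or the JVM's actual classloader code: `ToyLoader`, the loader names, and the class-to-jar mappings are hypothetical and chosen for illustration.

```python
class ToyLoader:
    """Toy model of parent-first class lookup (not a real JVM classloader)."""

    def __init__(self, name, classes, parent=None):
        self.name = name          # label for this loader
        self.classes = classes    # class name -> jar that provides it
        self.parent = parent      # loader consulted before this one

    def load(self, class_name):
        # Parent-first: ask the parent before looking at our own jars.
        if self.parent is not None:
            found = self.parent.load(class_name)
            if found is not None:
                return found
        jar = self.classes.get(class_name)
        return (self.name, jar) if jar is not None else None


# Hypothetical contents: the worker already provides Guava, and the
# connector archive bundles a second copy next to its own jfairy jar.
worker = ToyLoader("worker", {
    "com.google.common.collect.Lists": "guava-30.1-jre.jar",
})
nar = ToyLoader("nar", {
    "com.google.common.collect.Lists": "guava-30.1-jre.jar",
    "io.codearte.jfairy.Fairy": "jfairy-0.5.9.jar",
}, parent=worker)

print(nar.load("com.google.common.collect.Lists"))
print(nar.load("io.codearte.jfairy.Fairy"))
```

In this model the Guava class always resolves through the `worker` loader, so the copy bundled in the `nar` loader is dead weight, while the jfairy class is only found in the `nar` loader and must stay bundled.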
Additional context
Reducing the size of Pulsar IO .nar files would also help reduce the size of the pulsar-all Docker image. The Pulsar (core) build would benefit as well, although PIP-62 covers moving the Pulsar IO connectors from the apache/pulsar repository to apache/pulsar-connectors.
Top GitHub Comments
@lhotari thanks for your detailed description. @sijie sure, I will create a PR to fix this issue.
Closed as stale, and it seems resolved. I checked the latest data generator nar, and it is 11M in size.
Please open a new issue if it’s still relevant to the maintained versions.