Dependency Conflicts with Spark 2.3.0
Spark 2.3.0 had the following dependency version conflicts, which caused trouble when I was trying to run an ETL job:
parquet:
- Spark 2.3.0: 1.8.2
- Hudi 0.4.1: 1.8.1

Seeing errors like the following:

```
2018-04-16 07:19:11 INFO TaskSetManager:54 - Lost task 19.8 in stage 6.0 (TID 2253) on 10.28.21.230, executor 19: java.lang.NoSuchMethodError (org.apache.parquet.column.ParquetProperties.<init>(ILorg/apache/parquet/column/ParquetProperties$WriterVersion;Z)V) [duplicate 202]
```
io.netty.netty-all:
- Spark 2.3.0: 4.1.17.Final
- Hudi 0.4.1: 4.0.23.Final
Wondering what the best practice typically is for dealing with dependency conflicts between Hudi and Spark.
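One common approach to conflicts like these (not confirmed in this thread as the project's chosen solution) is to shade and relocate the clashing packages in the application jar, so that the application's parquet/netty classes no longer collide with the versions Spark ships on its classpath. A minimal sketch using the maven-shade-plugin, where the `shadedPattern` prefix is an arbitrary name of my choosing:

```xml
<!-- Sketch: relocate parquet classes inside the fat jar so they cannot
     clash with Spark's own parquet 1.8.2. The shadedPattern prefix
     "myapp.shaded" is a hypothetical example, not from the issue. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>org.apache.parquet</pattern>
            <shadedPattern>myapp.shaded.org.apache.parquet</shadedPattern>
          </relocation>
          <relocation>
            <pattern>io.netty</pattern>
            <shadedPattern>myapp.shaded.io.netty</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The trade-off is a larger jar, but the relocated classes are fully isolated from whatever versions Spark loads first.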
Issue Analytics
- State:
- Created 5 years ago
- Comments:10 (9 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@leletan: I was able to get the tests to succeed with spark-2.3.0. Can you try the commit below and see whether it works for you: https://github.com/bvaradar/hudi/commit/e189734a07b8782ea1d21b3c780dfc61c2ab8f2b
Regarding your question: supporting spark-2.x versions (one option is via mvn profiles) is definitely part of our plan.
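The mvn-profiles option mentioned above could look roughly like this: one profile per supported Spark line, each pinning the transitive versions that Spark expects. The profile ids and property names here are illustrative assumptions, not the project's actual build configuration:

```xml
<!-- Sketch: per-Spark-version Maven profiles. Profile ids and property
     names are hypothetical; the version numbers come from the issue. -->
<profiles>
  <profile>
    <id>spark-2.3</id>
    <properties>
      <spark.version>2.3.0</spark.version>
      <parquet.version>1.8.2</parquet.version>
      <netty.version>4.1.17.Final</netty.version>
    </properties>
  </profile>
</profiles>
```

A build for a given Spark line would then be selected on the command line, e.g. `mvn package -Pspark-2.3`.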
@AaronCH5 I’m not aware of what redisson is or how you are using Hudi with it. If there are version clashes, as you mentioned, you can build different versions of it with the required packages, or use mvn’s exclusion feature to prevent two different versions of the jar from being packaged.
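The exclusion feature mentioned above works per-dependency in the pom: you exclude the transitive artifact that clashes, so only the version Spark provides remains on the classpath. A minimal sketch, assuming Hudi 0.4.1's `com.uber.hoodie:hoodie-client` coordinates and using the netty conflict from this issue:

```xml
<!-- Sketch: drop Hudi's transitive netty 4.0.23.Final so Spark's
     4.1.17.Final is the only copy on the classpath. The artifact
     coordinates are an assumption based on the Hudi 0.4.1 era. -->
<dependency>
  <groupId>com.uber.hoodie</groupId>
  <artifactId>hoodie-client</artifactId>
  <version>0.4.1</version>
  <exclusions>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree` before and after is a quick way to confirm which copy of a conflicting jar actually ends up in the build.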