
Dependency Conflicts with Spark 2.3.0


Spark 2.3.0 has the following dependency version conflicts, which caused trouble when I was trying to run an ETL job:

parquet:

  • Spark 2.3.0: 1.8.2
  • Hudi 0.4.1: 1.8.1

Seeing errors like the following:

`2018-04-16 07:19:11 INFO TaskSetManager:54 - Lost task 19.8 in stage 6.0 (TID 2253) on 10.28.21.230, executor 19: java.lang.NoSuchMethodError (org.apache.parquet.column.ParquetProperties.<init>(ILorg/apache/parquet/column/ParquetProperties$WriterVersion;Z)V) [duplicate 202]`

io.netty:netty-all:

  • Spark 2.3.0: 4.1.17.Final
  • Hudi 0.4.1: 4.0.23.Final

Wondering what the best practice typically is for dealing with dependency conflicts between Hudi and Spark.
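One common way to resolve clashes like the netty one above (not discussed in this thread; a hedged sketch, with plugin version and relocation prefix chosen for illustration) is to shade and relocate the conflicting packages in the job's own pom.xml with the maven-shade-plugin, so the job carries a private copy instead of fighting the version Spark ships:

```xml
<!-- Illustrative pom.xml fragment: relocate the job's netty classes so
     they cannot clash with the 4.1.17.Final copy bundled with Spark 2.3.0.
     The version and the shaded package prefix are hypothetical. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.1.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>io.netty</pattern>
            <shadedPattern>com.example.shaded.io.netty</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Figuring out which versions actually land on the classpath is usually the first step; `mvn dependency:tree -Dincludes=io.netty` narrows the tree to the conflicting coordinate.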

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 10 (9 by maintainers)

Top GitHub Comments

2 reactions
bvaradar commented, Apr 25, 2018

@leletan: I was able to get the tests to succeed with spark-2.3.0. Can you try the commit below and see whether it works for you: https://github.com/bvaradar/hudi/commit/e189734a07b8782ea1d21b3c780dfc61c2ab8f2b

Regarding your question, supporting multiple spark-2.x versions (one option is via mvn profiles) is definitely part of our plan.
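As a hedged illustration of the mvn-profiles option mentioned above (the profile ids and version properties are assumptions, not Hudi's actual build setup), a parent pom.xml could pin the Spark and parquet versions per profile:

```xml
<!-- Illustrative pom.xml fragment: one profile per supported Spark line.
     Profile ids, property names, and versions are hypothetical. -->
<profiles>
  <profile>
    <id>spark-2.1</id>
    <properties>
      <spark.version>2.1.0</spark.version>
      <parquet.version>1.8.1</parquet.version>
    </properties>
  </profile>
  <profile>
    <id>spark-2.3</id>
    <activation><activeByDefault>true</activeByDefault></activation>
    <properties>
      <spark.version>2.3.0</spark.version>
      <parquet.version>1.8.2</parquet.version>
    </properties>
  </profile>
</profiles>
```

Building against a specific line would then be a matter of selecting the profile, e.g. `mvn package -P spark-2.1`.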

0 reactions
n3nash commented, Mar 14, 2019

@AaronCH5 I’m not aware of what redisson is or how you are using Hudi with it. If there are version clashes, as you mentioned, you can build different versions of it with the required packages, or use mvn’s exclusion feature to avoid two different versions of the jar being packaged.
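The mvn exclusion approach mentioned above might look like the following (a hedged sketch; the redisson version and the choice of netty-all as the excluded transitive artifact are assumptions for illustration):

```xml
<!-- Hypothetical dependency block: exclude redisson's transitive netty
     so only one netty version ends up in the packaged job. -->
<dependency>
  <groupId>org.redisson</groupId>
  <artifactId>redisson</artifactId>
  <version>3.5.0</version>
  <exclusions>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

The trade-off of excluding (rather than shading) is that the remaining single version must be binary-compatible with every consumer, so it works best when the two versions are close.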


Top Results From Across the Web

  • Dependency Conflicts with Spark 2.3.0 #381 - apache/hudi
  • Building Spark - Spark 2.3.0 Documentation
  • Jar dependencies error using Spark 2.3 structured streaming
  • Manage Java and Scala dependencies for Apache Spark
  • Solved: Re: remote pyspark shell and spark-submit error ja...
