
Dependency issues when using --packages option with spark

See original GitHub issue

I'm hitting an issue when using the --packages option with spark-shell. Any idea why this is happening? I'm using Spark 1.6.1 on Amazon EMR (emr-4.7.1).

**spark-shell --packages com.databricks:spark-redshift_2.10:0.6.0**
Ivy Default Cache set to: /home/hadoop/.ivy2/cache
The jars for the packages stored in: /home/hadoop/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/lib/spark-assembly-1.6.1-hadoop2.7.2-amzn-2.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-redshift_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
    confs: [default]
    found com.databricks#spark-redshift_2.10;0.6.0 in central
    found org.slf4j#slf4j-api;1.7.5 in local-m2-cache
    found com.databricks#spark-avro_2.10;2.0.1 in central
    found org.apache.avro#avro;1.7.6 in central
    found org.codehaus.jackson#jackson-core-asl;1.9.13 in local-m2-cache
    found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in local-m2-cache
    found com.thoughtworks.paranamer#paranamer;2.3 in local-m2-cache
    found org.xerial.snappy#snappy-java;1.0.5 in local-m2-cache
    found org.apache.commons#commons-compress;1.4.1 in local-m2-cache
    found org.tukaani#xz;1.0 in local-m2-cache
downloading https://repo1.maven.org/maven2/com/databricks/spark-redshift_2.10/0.6.0/spark-redshift_2.10-0.6.0.jar ...
    [SUCCESSFUL ] com.databricks#spark-redshift_2.10;0.6.0!spark-redshift_2.10.jar (27ms)
downloading https://repo1.maven.org/maven2/com/databricks/spark-avro_2.10/2.0.1/spark-avro_2.10-2.0.1.jar ...
    [SUCCESSFUL ] com.databricks#spark-avro_2.10;2.0.1!spark-avro_2.10.jar (13ms)
downloading https://repo1.maven.org/maven2/org/apache/avro/avro/1.7.6/avro-1.7.6.jar ...
    [SUCCESSFUL ] org.apache.avro#avro;1.7.6!avro.jar(bundle) (22ms)
downloading file:/home/hadoop/.m2/repository/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar ...
    [SUCCESSFUL ] org.apache.commons#commons-compress;1.4.1!commons-compress.jar (2ms)
downloading file:/home/hadoop/.m2/repository/org/tukaani/xz/1.0/xz-1.0.jar ...
    [SUCCESSFUL ] org.tukaani#xz;1.0!xz.jar (2ms)
:: resolution report :: resolve 2496ms :: artifacts dl 88ms
    :: modules in use:
    com.databricks#spark-avro_2.10;2.0.1 from central in [default]
    com.databricks#spark-redshift_2.10;0.6.0 from central in [default]
    com.thoughtworks.paranamer#paranamer;2.3 from local-m2-cache in [default]
    org.apache.avro#avro;1.7.6 from central in [default]
    org.apache.commons#commons-compress;1.4.1 from local-m2-cache in [default]
    org.codehaus.jackson#jackson-core-asl;1.9.13 from local-m2-cache in [default]
    org.codehaus.jackson#jackson-mapper-asl;1.9.13 from local-m2-cache in [default]
    org.slf4j#slf4j-api;1.7.5 from local-m2-cache in [default]
    org.tukaani#xz;1.0 from local-m2-cache in [default]
    org.xerial.snappy#snappy-java;1.0.5 from local-m2-cache in [default]
    :: evicted modules:
    org.slf4j#slf4j-api;1.6.4 by [org.slf4j#slf4j-api;1.7.5] in [default]
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   11  |   10  |   10  |   1   ||   10  |   5   |
    ---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
        [NOT FOUND  ] org.slf4j#slf4j-api;1.7.5!slf4j-api.jar (0ms)

    ==== local-m2-cache: tried

      file:/home/hadoop/.m2/repository/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar

        [NOT FOUND  ] org.codehaus.jackson#jackson-core-asl;1.9.13!jackson-core-asl.jar (0ms)

    ==== local-m2-cache: tried

      file:/home/hadoop/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar

        [NOT FOUND  ] org.codehaus.jackson#jackson-mapper-asl;1.9.13!jackson-mapper-asl.jar (0ms)

    ==== local-m2-cache: tried

      file:/home/hadoop/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.13/jackson-mapper-asl-1.9.13.jar

        [NOT FOUND  ] com.thoughtworks.paranamer#paranamer;2.3!paranamer.jar (0ms)

    ==== local-m2-cache: tried

      file:/home/hadoop/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3/paranamer-2.3.jar

        [NOT FOUND  ] org.xerial.snappy#snappy-java;1.0.5!snappy-java.jar(bundle) (0ms)

    ==== local-m2-cache: tried

      file:/home/hadoop/.m2/repository/org/xerial/snappy/snappy-java/1.0.5/snappy-java-1.0.5.jar

        ::::::::::::::::::::::::::::::::::::::::::::::

        ::              FAILED DOWNLOADS            ::

        :: ^ see resolution messages for details  ^ ::

        ::::::::::::::::::::::::::::::::::::::::::::::

        :: org.slf4j#slf4j-api;1.7.5!slf4j-api.jar

        :: org.codehaus.jackson#jackson-core-asl;1.9.13!jackson-core-asl.jar

        :: org.codehaus.jackson#jackson-mapper-asl;1.9.13!jackson-mapper-asl.jar

        :: com.thoughtworks.paranamer#paranamer;2.3!paranamer.jar

        :: org.xerial.snappy#snappy-java;1.0.5!snappy-java.jar(bundle)

        ::::::::::::::::::::::::::::::::::::::::::::::



:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [download failed: org.slf4j#slf4j-api;1.7.5!slf4j-api.jar, download failed: org.codehaus.jackson#jackson-core-asl;1.9.13!jackson-core-asl.jar, download failed: org.codehaus.jackson#jackson-mapper-asl;1.9.13!jackson-mapper-asl.jar, download failed: com.thoughtworks.paranamer#paranamer;2.3!paranamer.jar, download failed: org.xerial.snappy#snappy-java;1.0.5!snappy-java.jar(bundle)]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1068)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:287)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

DerekHanqingWang commented on Nov 27, 2017 · 58 reactions

The problem is not related to Spark or Ivy itself; it's essentially a local Maven repository issue. When you specify a third-party library in --packages, Ivy first checks the local Ivy repo and the local Maven repo for the library and all of its dependencies. If found, it won't try to download them from the central repo. However, when searching the local Maven repo, Ivy only checks whether the artifact's directory exists, without checking whether a jar file is actually present in it.

found com.thoughtworks.paranamer#paranamer;2.3 in local-m2-cache

This message indicates that the directory for paranamer-2.3.jar was found in the local Maven repo, but if you go to that directory you will find no jar file there. Most likely Maven tried to download the artifact from central at some point and failed to fetch the jar for some reason.
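You can confirm this from a shell. A minimal check, using the paranamer path from the log above (the same pattern applies to each entry under FAILED DOWNLOADS):

```shell
# If the artifact directory exists but holds no jar, Ivy's
# "found ... in local-m2-cache" is a false positive and the
# subsequent download step will fail.
dir="$HOME/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3"
if [ -d "$dir" ] && ! ls "$dir"/*.jar >/dev/null 2>&1; then
  echo "stale: $dir exists but contains no jar"
fi
```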

A solution is to remove the related directories under ~/.ivy2/cache, ~/.ivy2/jars, and ~/.m2/repository/.
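For example, to clear just the paranamer entry from the log above (a sketch; the filename under ~/.ivy2/jars follows Ivy's usual org_artifact-version.jar convention and may differ on your machine):

```shell
# Remove the stale local-m2 directory plus Ivy's cached copies of it,
# then re-run spark-shell --packages so the jar is fetched from central.
rm -rf "$HOME/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3"
rm -rf "$HOME/.ivy2/cache/com.thoughtworks.paranamer"
rm -f  "$HOME/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar"
```

Repeat for each coordinate listed under FAILED DOWNLOADS.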

dfdeshom commented on Aug 15, 2016 · 4 reactions

Ran into the same issue. In my case, deleting my $HOME/.ivy2 directory and running ./bin/spark-shell --packages com.databricks:spark-redshift_2.10:2.0.0 again got rid of the problem.
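If you'd rather not delete the existing caches at all, Spark can be pointed at a separate Ivy directory via the spark.jars.ivy setting, so resolution starts from a clean slate. A sketch (/tmp/fresh-ivy is an arbitrary example path):

```shell
# Resolve into a throwaway Ivy directory instead of ~/.ivy2,
# bypassing any stale local cache state for this run.
spark-shell --conf spark.jars.ivy=/tmp/fresh-ivy \
            --packages com.databricks:spark-redshift_2.10:0.6.0
```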

