Stuck on an issue?

Lightrun Answers was designed to reduce the constant Googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Package from spPublishLocal not usable due to scala version appearing in the ivy module name

See original GitHub issue

Up until now I’ve used sbt assembly, but now I’m trying to work with spPublishLocal to package before running (automated) integration tests (e.g. on Travis) for http://spark-packages.org/package/TargetHolding/pyspark-cassandra.

I’ve built pyspark-cassandra with:

sbt spPublishLocal
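
For context, spPublishLocal comes from the sbt-spark-package plugin and publishes the package to the local Ivy repository (~/.ivy2/local, as the log further down confirms), so that --packages can pick it up without a remote repository. A minimal sketch of the plugin wiring this assumes, in project/plugins.sbt; the resolver URL and plugin version here are illustrative, not taken from this project's build:

// Hypothetical project/plugins.sbt; resolver and version are assumptions.
resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven/"
addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.6")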

When I run pyspark with

PYSPARK_DRIVER_PYTHON=ipython \
path/to/spark-1.5.2-bin-hadoop2.6/bin/pyspark \
--conf spark.cassandra.connection.host=localhost \
--driver-memory 2g \
--master local[*] \
--packages TargetHolding/pyspark-cassandra:0.3.0
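
For reference, --packages coordinates are resolved through Ivy: the spark-packages form TargetHolding/pyspark-cassandra:0.3.0 becomes the Ivy module TargetHolding#pyspark-cassandra;0.3.0 (the "added as a dependency" line in the output below shows exactly this), and the module name declared in the published ivy.xml has to match that name exactly. A toy Scala sketch of the mapping, not Spark's actual code (the real logic lives in SparkSubmitUtils.resolveMavenCoordinates, visible in the stack trace below):

// Illustrative only: a toy version of the coordinate-to-Ivy mapping.
object CoordinateSketch {
  // spark-packages form: org/name:version
  def parse(coordinate: String): (String, String, String) = {
    val Array(orgAndName, version) = coordinate.split(":", 2)
    val Array(org, name) = orgAndName.split("/", 2)
    (org, name, version)
  }

  def main(args: Array[String]): Unit = {
    val (org, name, rev) = parse("TargetHolding/pyspark-cassandra:0.3.0")
    // Ivy expects ~/.ivy2/local/<org>/<name>/<rev>/ivys/ivy.xml to declare
    // a module named exactly <name> -- with no Scala suffix.
    println(s"$org#$name;$rev")  // TargetHolding#pyspark-cassandra;0.3.0
  }
}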

I get:

Python 2.7.10 (default, Sep 24 2015, 17:50:09) 
Type "copyright", "credits" or "license" for more information.

IPython 2.4.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
Ivy Default Cache set to: /home/frens-jan/.ivy2/cache
The jars for the packages stored in: /home/frens-jan/.ivy2/jars
:: loading settings :: url = jar:file:/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
TargetHolding#pyspark-cassandra added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
    confs: [default]
:: resolution report :: resolve 794ms :: artifacts dl 0ms
    :: modules in use:
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
    ---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
        ::::::::::::::::::::::::::::::::::::::::::::::

        ::          UNRESOLVED DEPENDENCIES         ::

        ::::::::::::::::::::::::::::::::::::::::::::::

        :: TargetHolding#pyspark-cassandra;0.3.0: java.text.ParseException: inconsistent module descriptor file found in '/home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml': bad module name: expected='pyspark-cassandra' found='pyspark-cassandra_2.10'; 

        ::::::::::::::::::::::::::::::::::::::::::::::


:::: ERRORS
        local-ivy-cache: bad module name found in /home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml: expected='pyspark-cassandra found='pyspark-cassandra_2.10'


:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: TargetHolding#pyspark-cassandra;0.3.0: java.text.ParseException: inconsistent module descriptor file found in '/home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml': bad module name: expected='pyspark-cassandra' found='pyspark-cassandra_2.10'; ]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1011)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:286)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/shell.py in <module>()
     41     SparkContext.setSystemProperty("spark.executor.uri", os.environ["SPARK_EXECUTOR_URI"])
     42 
---> 43 sc = SparkContext(pyFiles=add_files)
     44 atexit.register(lambda: sc.stop())
     45 

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.pyc in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    108         """
    109         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 110         SparkContext._ensure_initialized(self, gateway=gateway)
    111         try:
    112             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.pyc in _ensure_initialized(cls, instance, gateway)
    232         with SparkContext._lock:
    233             if not SparkContext._gateway:
--> 234                 SparkContext._gateway = gateway or launch_gateway()
    235                 SparkContext._jvm = SparkContext._gateway.jvm
    236 

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/java_gateway.pyc in launch_gateway()
     92                 callback_socket.close()
     93         if gateway_port is None:
---> 94             raise Exception("Java gateway process exited before sending the driver its port number")
     95 
     96         # In Windows, ensure the Java child processes do not linger after Python has exited.

Exception: Java gateway process exited before sending the driver its port number

In [1]: 

Any ideas what I am doing wrong?
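
A note on the failure mode: the Python traceback is just fallout, since the JVM exits once the dependency is unresolved and the gateway never reports a port. The root cause is the Ivy warning above: spPublishLocal writes the package under the un-suffixed directory ~/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/, while the ivy.xml inside declares the sbt cross-versioned module name pyspark-cassandra_2.10, and Ivy rejects the inconsistent descriptor. The suffix is sbt's default cross-versioning at work. One possible build.sbt workaround is sketched below; disabling cross paths is a swapped-in technique, untested here and not something this thread confirms:

// Sketch, assuming a standard sbt build. With the default
// crossPaths := true, a project named "pyspark-cassandra" built on
// Scala 2.10 publishes a descriptor naming the module
// pyspark-cassandra_2.10. Disabling cross paths publishes the plain
// name instead (at the cost of per-Scala-version artifacts):
crossPaths := false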

Issue Analytics

  • State: open
  • Created: 8 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
touchdown commented, Apr 19, 2018

Ran into a similar issue; added the following for https://github.com/databricks/spark-deep-learning:

organization := "databricks"

name := "spark-deep-learning"

spName := organization.value + "/" + name.value

projectID := {ModuleID(organization.value, name.value, s"${version.value}-s_$scalaMajorVersion")}
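
Note that scalaMajorVersion in the snippet above is not an sbt built-in; the commenter presumably defines it elsewhere in their build. A self-contained sketch of the same idea using the standard scalaBinaryVersion key instead (illustrative, not verified against sbt-spark-package):

// build.sbt sketch: keep the Ivy module name free of the _2.xx suffix
// and fold the Scala version into the revision instead, following the
// spark-packages "-s_2.xx" version convention.
organization := "databricks"

name := "spark-deep-learning"

spName := organization.value + "/" + name.value

projectID := ModuleID(organization.value, name.value,
  s"${version.value}-s_${scalaBinaryVersion.value}")

With this override the descriptor's module name matches the directory Ivy looks in, so a coordinate like databricks/spark-deep-learning:<version>-s_2.11 should resolve.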
0 reactions
frensjan commented, Nov 17, 2017

nope, sorry

Read more comments on GitHub >

Top Results From Across the Web

Issues · databricks/sbt-spark-package - GitHub
Package from spPublishLocal not usable due to scala version appearing in the ivy module name bug. #17 opened on Jan 15, 2016 by...
Read more >
databricks - Bountysource
Created 1 year ago in databricks/sbt-spark-package with 1 comments. ... Package from spPublishLocal not usable due to scala version appearing in the ivy...
Read more >
Cannot Create IntelliJ Scala Project Due to Missing Ivy Module
I have tried re-installing IntelliJ, adding the apache ivy jars from Apache (and importing them in the build.sbt!), re-packaging, re-assembling, ...
Read more >
dependency | Apache Ivy™ Documentation
A dependency is described by the module on which the current module depends (identified by its name, organisation and revision), and a mapping...
Read more >
Ivy Publish Plugin - Gradle User Manual
What is published is one or more artifacts created by the build, and an Ivy module descriptor (normally ivy.xml ) that describes the...
Read more >
