Stuck on an issue?

Lightrun Answers was designed to reduce the constant Googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Package from spPublishLocal not usable due to scala version appearing in the ivy module name

See original GitHub issue

Up until now I’ve used sbt assembly, but now I’m trying to work with spPublishLocal to package before running (automated) integration tests (e.g. on Travis) for http://spark-packages.org/package/TargetHolding/pyspark-cassandra.

I’ve built pyspark-cassandra with:

sbt spPublishLocal
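
For context, spPublishLocal comes from the sbt-spark-package plugin and publishes the package to the local Ivy repository (~/.ivy2/local, as the log further down confirms), so that --packages can pick it up without a remote repository. A minimal sketch of the plugin wiring this assumes, in project/plugins.sbt; the resolver URL and plugin version here are illustrative, not taken from this project's build:

// Hypothetical project/plugins.sbt; resolver and version are assumptions.
resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven/"
addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.6")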

When I run pyspark with

PYSPARK_DRIVER_PYTHON=ipython \
path/to/spark-1.5.2-bin-hadoop2.6/bin/pyspark \
--conf spark.cassandra.connection.host=localhost \
--driver-memory 2g \
--master local[*] \
--packages TargetHolding/pyspark-cassandra:0.3.0
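
For reference, --packages coordinates are resolved through Ivy: the spark-packages form TargetHolding/pyspark-cassandra:0.3.0 becomes the Ivy module TargetHolding#pyspark-cassandra;0.3.0 (the "added as a dependency" line in the output below shows exactly this), and the module name declared in the published ivy.xml has to match that name exactly. A toy Scala sketch of the mapping, not Spark's actual code (the real logic lives in SparkSubmitUtils.resolveMavenCoordinates, visible in the stack trace below):

// Illustrative only: a toy version of the coordinate-to-Ivy mapping.
object CoordinateSketch {
  // spark-packages form: org/name:version
  def parse(coordinate: String): (String, String, String) = {
    val Array(orgAndName, version) = coordinate.split(":", 2)
    val Array(org, name) = orgAndName.split("/", 2)
    (org, name, version)
  }

  def main(args: Array[String]): Unit = {
    val (org, name, rev) = parse("TargetHolding/pyspark-cassandra:0.3.0")
    // Ivy expects ~/.ivy2/local/<org>/<name>/<rev>/ivys/ivy.xml to declare
    // a module named exactly <name> -- with no Scala suffix.
    println(s"$org#$name;$rev")  // TargetHolding#pyspark-cassandra;0.3.0
  }
}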

I get:

Python 2.7.10 (default, Sep 24 2015, 17:50:09) 
Type "copyright", "credits" or "license" for more information.

IPython 2.4.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
Ivy Default Cache set to: /home/frens-jan/.ivy2/cache
The jars for the packages stored in: /home/frens-jan/.ivy2/jars
:: loading settings :: url = jar:file:/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
TargetHolding#pyspark-cassandra added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
    confs: [default]
:: resolution report :: resolve 794ms :: artifacts dl 0ms
    :: modules in use:
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
    ---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
        ::::::::::::::::::::::::::::::::::::::::::::::

        ::          UNRESOLVED DEPENDENCIES         ::

        ::::::::::::::::::::::::::::::::::::::::::::::

        :: TargetHolding#pyspark-cassandra;0.3.0: java.text.ParseException: inconsistent module descriptor file found in '/home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml': bad module name: expected='pyspark-cassandra' found='pyspark-cassandra_2.10'; 

        ::::::::::::::::::::::::::::::::::::::::::::::


:::: ERRORS
        local-ivy-cache: bad module name found in /home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml: expected='pyspark-cassandra found='pyspark-cassandra_2.10'


:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: TargetHolding#pyspark-cassandra;0.3.0: java.text.ParseException: inconsistent module descriptor file found in '/home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml': bad module name: expected='pyspark-cassandra' found='pyspark-cassandra_2.10'; ]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1011)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:286)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/shell.py in <module>()
     41     SparkContext.setSystemProperty("spark.executor.uri", os.environ["SPARK_EXECUTOR_URI"])
     42 
---> 43 sc = SparkContext(pyFiles=add_files)
     44 atexit.register(lambda: sc.stop())
     45 

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.pyc in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    108         """
    109         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 110         SparkContext._ensure_initialized(self, gateway=gateway)
    111         try:
    112             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.pyc in _ensure_initialized(cls, instance, gateway)
    232         with SparkContext._lock:
    233             if not SparkContext._gateway:
--> 234                 SparkContext._gateway = gateway or launch_gateway()
    235                 SparkContext._jvm = SparkContext._gateway.jvm
    236 

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/java_gateway.pyc in launch_gateway()
     92                 callback_socket.close()
     93         if gateway_port is None:
---> 94             raise Exception("Java gateway process exited before sending the driver its port number")
     95 
     96         # In Windows, ensure the Java child processes do not linger after Python has exited.

Exception: Java gateway process exited before sending the driver its port number

In [1]: 

Any ideas what I am doing wrong?
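
A note on the failure mode: the Python traceback is just fallout, since the JVM exits once the dependency is unresolved and the gateway never reports a port. The root cause is the Ivy warning above: spPublishLocal writes the package under the un-suffixed directory ~/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/, while the ivy.xml inside declares the sbt cross-versioned module name pyspark-cassandra_2.10, and Ivy rejects the inconsistent descriptor. The suffix is sbt's default cross-versioning at work. One possible build.sbt workaround is sketched below; disabling cross paths is a swapped-in technique, untested here and not something this thread confirms:

// Sketch, assuming a standard sbt build. With the default
// crossPaths := true, a project named "pyspark-cassandra" built on
// Scala 2.10 publishes a descriptor naming the module
// pyspark-cassandra_2.10. Disabling cross paths publishes the plain
// name instead (at the cost of per-Scala-version artifacts):
crossPaths := false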

Issue Analytics

  • State: open
  • Created: 8 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
touchdown commented, Apr 19, 2018

Ran into a similar issue; added the following for https://github.com/databricks/spark-deep-learning:

organization := "databricks"

name := "spark-deep-learning"

spName := organization.value + "/" + name.value

projectID := {ModuleID(organization.value, name.value, s"${version.value}-s_$scalaMajorVersion")}
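
Note that scalaMajorVersion in the snippet above is not an sbt built-in; the commenter presumably defines it elsewhere in their build. A self-contained sketch of the same idea using the standard scalaBinaryVersion key instead (illustrative, not verified against sbt-spark-package):

// build.sbt sketch: keep the Ivy module name free of the _2.xx suffix
// and fold the Scala version into the revision instead, following the
// spark-packages "-s_2.xx" version convention.
organization := "databricks"

name := "spark-deep-learning"

spName := organization.value + "/" + name.value

projectID := ModuleID(organization.value, name.value,
  s"${version.value}-s_${scalaBinaryVersion.value}")

With this override the descriptor's module name matches the directory Ivy looks in, so a coordinate like databricks/spark-deep-learning:<version>-s_2.11 should resolve.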
0 reactions
frensjan commented, Nov 17, 2017

nope, sorry

Read more comments on GitHub >

Top Results From Across the Web

Issues · databricks/sbt-spark-package - GitHub
Package from spPublishLocal not usable due to scala version appearing in the ivy module name bug. #17 opened on Jan 15, 2016 by...
Read more >
databricks - Bountysource
Created 1 year ago in databricks/sbt-spark-package with 1 comments. ... Package from spPublishLocal not usable due to scala version appearing in the ivy...
Read more >
Cannot Create IntelliJ Scala Project Due to Missing Ivy Module
I have tried re-installing IntelliJ, adding the apache ivy jars from Apache (and importing them in the build.sbt!), re-packaging, re-assembling, ...
Read more >
dependency | Apache Ivy™ Documentation
A dependency is described by the module on which the current module depends (identified by its name, organisation and revision), and a mapping...
Read more >
Ivy Publish Plugin - Gradle User Manual
What is published is one or more artifacts created by the build, and an Ivy module descriptor (normally ivy.xml ) that describes the...
Read more >
