Classpath conflicts between multiple instances of ScalaInterpreters

I have multiple instances of ScalaInterpreter in my custom kernel. I’m using a SparkSession inside, and for each interpreter I use the following startup code:

import org.apache.spark.sql._
val sparkSession = NotebookSparkSession.builder()
  .master("local[1]")
  .getOrCreate()

I start my kernel with the option specificLoader=false so that all interpreters share the same sparkContext instance, and everything mostly works fine. From time to time, though, I see serialization errors from Spark (the same code may work on one run and throw an exception on the next), and I believe they are related to the class loader.

java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance of org.apache.spark.rdd.MapPartitionsRDD

If I start my kernel with specificLoader=true I don’t see this error at all, but every interpreter creates a new instance of sparkContext, which I want to avoid.

My test code is the following:

val sc = sparkSession.sparkContext
val rdd = sc.parallelize(1 to 100, 10)
val n = rdd.map(_ + 1).sum()

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

1 reaction
alexarchambault commented, Oct 22, 2019

How do you make a custom package in Ammonite?

I don’t have a definite answer, but ultimately the package is set in the generated code around here. By navigating the Ammonite code from there, it may be possible to pass a custom package prefix to ammonite.interp.Interpreter so that it ends up being used there.

Alternatively, the cmd prefix in the class name is defined here. Simply allowing it to be customized via a field passed to ammonite.interp.Interpreter (and setting it yourself to "cmd1_", "cmd2_", etc., so that the classes are named "cmd1_1", "cmd1_2", …) may work as well, and could be more straightforward.
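To make the naming scheme above concrete, here is a minimal sketch in plain Scala. It is not Ammonite code: WrapperNaming, prefixFor and wrapperName are hypothetical names invented for illustration, and how such a prefix would actually be passed to ammonite.interp.Interpreter is left open.

```scala
// Hypothetical sketch: none of these names come from Ammonite itself.
object WrapperNaming {
  // One prefix per interpreter instance: 1 -> "cmd1_", 2 -> "cmd2_", ...
  def prefixFor(interpreterId: Int): String = s"cmd${interpreterId}_"

  // Name of the wrapper class for the n-th command of a session.
  def wrapperName(prefix: String, index: Int): String = s"$prefix$index"

  def main(args: Array[String]): Unit = {
    val first  = (1 to 3).map(wrapperName(prefixFor(1), _))
    val second = (1 to 3).map(wrapperName(prefixFor(2), _))
    println(first.mkString(", "))  // cmd1_1, cmd1_2, cmd1_3
    println(second.mkString(", ")) // cmd2_1, cmd2_2, cmd2_3
    // No class name is shared between the two interpreters.
    assert(first.intersect(second).isEmpty)
  }
}
```

The underscore separator matters: without it, command 11 of interpreter 1 and command 1 of interpreter 11 would both be named "cmd111".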

The Ammonite README has some documentation detailing manual commands for quickly testing that kind of change.

1 reaction
alexarchambault commented, Oct 21, 2019

I’m not sure I understand what you’re trying to achieve: you have a custom kernel, using ScalaInterpreter, and you’re trying to have it act like multiple kernels at once, all sharing the same SparkSession, right?

If that’s the case, one issue I see is that each instance of ScalaInterpreter will set spark.repl.class.uri in the Spark conf. This setting corresponds to the URI of a small web server that ammonite-spark launches to serve the byte code of the classes generated during the session. I’m not sure which one Spark will retain in the end, but only one of them will likely be accessible. Parts of the ammonite-spark logic would need to be customized so that only one such server is spawned for all the sessions.
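As a rough illustration of the "only one server for all sessions" idea, the sketch below shares a single lazily created instance per JVM. It assumes nothing about ammonite-spark's real API: ClassServer, SharedClassServer and the placeholder URI are all hypothetical, and a real shared server would also have to serve the classes of every session.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the class server; not ammonite-spark's API.
final case class ClassServer(uri: String)

object SharedClassServer {
  // Counts how many servers were actually created (illustration only).
  val started = new AtomicInteger(0)

  // `lazy val` initialization is thread-safe in Scala, so the server is
  // created at most once per JVM even if several interpreters race here.
  private lazy val server: ClassServer = {
    started.incrementAndGet()
    ClassServer("http://127.0.0.1:12345/classes") // placeholder URI
  }

  // Every interpreter would set spark.repl.class.uri to this same value.
  def uri: String = server.uri
}

object SharedClassServerDemo {
  def main(args: Array[String]): Unit = {
    // Three "interpreters" asking for the URI all get the same one.
    val uris = (1 to 3).map(_ => SharedClassServer.uri)
    assert(uris.distinct.size == 1)
    assert(SharedClassServer.started.get == 1)
  }
}
```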

Another problem you’ll likely run into (I see you alluded to it on the Ammonite gitter) is that the ScalaInterpreters will generate classes with the same names (like cmd1, cmd2, etc.). That is going to be a problem for the Spark executors, which won’t be able to distinguish between the classes of the various interpreters. It may require some customization in Ammonite (to put all the classes of a session in a custom package, for example).
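The clash can be modelled in a few lines of plain Scala. This is a toy model only, assuming that classes are looked up purely by name; the session1 and session2 package names are made up for the example and are not something Ammonite generates today.

```scala
// Toy model: a lookup table keyed only by class name, as an executor
// fetching byte code by name would effectively see it.
object ClassNameCollision {
  // Records which interpreter defined a given class name.
  def register(served: Map[String, String],
               className: String,
               owner: String): Map[String, String] =
    served + (className -> owner)

  // Both interpreters generate a class called "cmd1".
  val plain: Map[String, String] =
    register(register(Map.empty, "cmd1", "interpreter A"), "cmd1", "interpreter B")

  // Same classes, but each session lives in its own (hypothetical) package.
  val packaged: Map[String, String] =
    register(register(Map.empty, "session1.cmd1", "interpreter A"),
             "session2.cmd1", "interpreter B")

  def main(args: Array[String]): Unit = {
    // Only one "cmd1" survives: interpreter A's class is shadowed.
    assert(plain.size == 1 && plain("cmd1") == "interpreter B")
    // With distinct packages, both definitions stay addressable.
    assert(packaged.size == 2)
  }
}
```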
