
Spark Streaming Runtime Issue with zipkin.Span KryoSerialization


zipkin.Span does not seem to be serializable under Spark Streaming's Kryo setup, either as KryoSerializable or via Kryo's JavaSerializer, as discussed with @adriancole on Gitter.

Versions: Kryo 2.21, Spark 1.6.1, Zipkin 1.7.0.

Using Kryo’s default JavaSerializer

Unit Test Passes

    Kryo kryo = new Kryo();
    kryo.register(Span.class, new JavaSerializer());
    Span testSpan = Span.builder()
        .traceId(1L)
        .name("normal")
        .id(2L)
        .annotations(new ArrayList<>())
        .binaryAnnotations(new ArrayList<>())
        .debug(false).build();
    byte[] actual = SerializationUtils.serialize(testSpan);
    Input input = new Input(actual);
    Object kryoSpan = kryo.readClassAndObject(input);
    assertEquals(testSpan, Span.class.cast(kryoSpan));
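Worth noting: a unit test like the one above exercises a Kryo instance constructed by hand, whereas Spark builds its own Kryo instances on the driver and every executor, so a registration that lives only in the test never reaches the runtime. A minimal sketch of wiring the registration into Spark instead (class and package names here are hypothetical):

```java
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.serializers.JavaSerializer;
import org.apache.spark.serializer.KryoRegistrator;
import zipkin.Span;

// Hypothetical registrator: Spark instantiates this class reflectively on the
// driver and on each executor, so the Span registration is applied everywhere
// Kryo instances are created, not just in the test JVM.
public class ZipkinKryoRegistrator implements KryoRegistrator {
  @Override
  public void registerClasses(Kryo kryo) {
    kryo.register(Span.class, new JavaSerializer());
  }
}
```

It would then be enabled in the SparkConf, e.g. `conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").set("spark.kryo.registrator", "com.example.ZipkinKryoRegistrator")` (package name illustrative).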

Runtime Fails

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 28.0 failed 1 times, most recent failure: Lost task 1.0 in stage 28.0 (TID 216, localhost): com.esotericsoftware.kryo.KryoException: 
Error during Java deserialization.
    at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:42)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
    at com.twitter.chill.TraversableSerializer.read(Traversable.scala:43)
    at com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
    at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
    at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:201)

Caused by: java.lang.ClassNotFoundException: zipkin.Span$SerializedForm
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:628)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:40)
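The ClassNotFoundException for zipkin.Span$SerializedForm suggests a classloader mismatch rather than a missing jar: Kryo's JavaSerializer builds a plain ObjectInputStream, which resolves classes against the JVM's application classloader instead of the executor's user-code classloader that actually holds the zipkin classes. One hedged workaround (untested in this setup, class name hypothetical) is a JavaSerializer variant that resolves classes through the thread context classloader, which Spark points at its user-code classloader on executors:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.KryoException;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

// Hypothetical drop-in replacement for Kryo's JavaSerializer that only
// changes how classes are resolved during deserialization.
public class ClassLoaderAwareJavaSerializer extends Serializer<Object> {
  @Override
  public void write(Kryo kryo, Output output, Object object) {
    try {
      ObjectOutputStream oos = new ObjectOutputStream(output);
      oos.writeObject(object);
      oos.flush();
    } catch (IOException e) {
      throw new KryoException("Error during Java serialization.", e);
    }
  }

  @Override
  public Object read(Kryo kryo, Input input, Class<Object> type) {
    try {
      ObjectInputStream ois = new ObjectInputStream(input) {
        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
          // Key difference from the stock JavaSerializer: resolve against
          // the context classloader, not the default application classloader.
          return Class.forName(desc.getName(), false,
              Thread.currentThread().getContextClassLoader());
        }
      };
      return ois.readObject();
    } catch (Exception e) {
      throw new KryoException("Error during Java deserialization.", e);
    }
  }
}
```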

Using custom serializer for zipkin.Span

Custom Serializer:

  public static class ZipkinSpanInternalSerializer
      extends com.esotericsoftware.kryo.Serializer<Span> {
    @Override
    public void write(Kryo kryo, Output output, Span span) {
      output.writeBytes(Codec.THRIFT.writeSpan(span));
      kryo.writeObject(output, span);
    }
    @Override
    public Span read(Kryo kryo, Input input, Class<Span> aClass) {
      System.out.println("READING");
      return Codec.THRIFT.readSpan(input.getBuffer());
    }
  }

Unit Test Passes:

    Kryo kryo = new Kryo();
    kryo.register(Span.class, new ZipkinSpanSerializer.ZipkinSpanInternalSerializer());
    Span testSpan = Span.builder()
        .traceId(1L)
        .name("normal")
        .id(2L)
        .annotations(new ArrayList<>())
        .binaryAnnotations(new ArrayList<>())
        .debug(false).build();

    byte[] actual = SerializationUtils.serialize(testSpan);

    Input input = new Input(actual);
    Object kryoSpan = kryo.readClassAndObject(input);
    assertEquals(testSpan, Span.class.cast(kryoSpan));

Runtime Fails:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 24.0 failed 1 times, most recent failure: Lost task 1.0 in stage 24.0 (TID 184, localhost):
 java.lang.IndexOutOfBoundsException: Index: 9, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:653)
    at java.util.ArrayList.get(ArrayList.java:429)
    at com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
    at com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:773)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:727)
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
    at org.apache.spark.serializer.DeserializationStream.readKey(Serializer.scala:169)
    at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:201)
    at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:198)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:152)
    at org.apache.spark.Aggregator.combineCombinersByKey(Aggregator.scala:58)
    at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:83)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:98)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
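The IndexOutOfBoundsException in MapReferenceResolver is consistent with the serializer above writing and reading asymmetrically: write() emits the raw thrift bytes and then also calls kryo.writeObject(output, span), which re-enters this same serializer, while read() pulls the entire backing buffer via input.getBuffer() without advancing the input, so Kryo's stream position and reference table fall out of sync for everything read afterwards. A hedged, length-prefixed rework (a sketch, not verified against this pipeline):

```java
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import zipkin.Codec;
import zipkin.Span;

public class ZipkinSpanThriftSerializer extends Serializer<Span> {
  @Override
  public void write(Kryo kryo, Output output, Span span) {
    byte[] bytes = Codec.THRIFT.writeSpan(span);
    // Length-prefix the payload so read() consumes exactly this span's bytes.
    output.writeInt(bytes.length, true);
    output.writeBytes(bytes);
    // No extra kryo.writeObject(output, span) here: that call re-enters this
    // serializer and writes the span a second time.
  }

  @Override
  public Span read(Kryo kryo, Input input, Class<Span> type) {
    int length = input.readInt(true);
    // readBytes advances the stream, unlike input.getBuffer(), which returns
    // the whole backing buffer and leaves the read position untouched.
    return Codec.THRIFT.readSpan(input.readBytes(length));
  }
}
```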

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Comments: 5
