Jobs are failing during schema registration while writing to Kafka from a batch DataFrame



We have built an ETL tool for moving data across many data platforms, and we are trying to include Kafka as part of our platform.

We are trying to use this library to write data to Kafka from a batch DataFrame, but we couldn’t get it to work because of the following error:

```
Exception in thread "main" java.lang.NoSuchFieldError: FACTORY
        at org.apache.avro.Schemas.toString(
        at org.apache.avro.Schemas.toString(
        at io.confluent.kafka.schemaregistry.avro.AvroSchema.canonicalString(
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(
        at io.confluent.kafka.schemaregistry.client.SchemaRegistryClient.register(
        at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$
        at scala.Option.getOrElse(Option.scala:189)
        at org.example.App$.main(App.scala:37)
        at org.example.App.main(App.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```

We reproduced this in a bare-minimum project written for testing, before implementing this as part of our tool.

Any immediate help would be greatly appreciated, as we are trying to onboard a few users who are on hold for this functionality. Please feel free to contact us with any questions.

Thanks for looking into this.
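
A `NoSuchFieldError` is thrown when code compiled against one version of a class runs against a different version that lacks the referenced field, so an error like the one above usually points to mismatched library versions on the runtime classpath. The following is a minimal diagnostic sketch, not part of the original report: it prints which jar each class is loaded from. The class names are taken from the stack trace, plus `org.apache.avro.Schema` as an assumed place to look for the `FACTORY` field; `ClasspathOrigin` and `originOf` are hypothetical names chosen for illustration.

```java
import java.security.CodeSource;

public class ClasspathOrigin {
    // Returns the location a class was loaded from, or a short note when
    // the class cannot be loaded or has no recorded code source.
    public static String originOf(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return source == null
                    ? "(no code source, likely a JDK class)"
                    : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Names from the stack trace; org.apache.avro.Schema is an assumption.
        String[] names = {
            "org.apache.avro.Schema",
            "org.apache.avro.Schemas",
            "io.confluent.kafka.schemaregistry.avro.AvroSchema"
        };
        for (String name : names) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

Running this inside the same `spark-submit` environment as the failing job, rather than locally, would show whether the Avro classes come from the application jar or from jars bundled with the cluster, which is typically where such version conflicts originate.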


Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12

Top GitHub Comments

cerveada commented, Nov 13, 2020

@markovarghese Version 4.0.1, which should fix this issue, has been released.

Would you mind testing whether it works as expected?

cerveada commented, Nov 18, 2020

Thanks for the help. I will close this issue; feel free to open a new one if needed.


