Jobs fail during schema registration while writing to Kafka from a batch DataFrame
Hi,
We have built an ETL tool (https://github.com/homeaway/datapull) for moving data across many data platforms, and we are trying to add Kafka as a supported platform.
We are trying to use this library to write data to Kafka from a batch DataFrame, but we couldn't get it to work because of the following error:
```
Exception in thread "main" java.lang.NoSuchFieldError: FACTORY
at org.apache.avro.Schemas.toString(Schemas.java:36)
at org.apache.avro.Schemas.toString(Schemas.java:30)
at io.confluent.kafka.schemaregistry.avro.AvroSchema.canonicalString(AvroSchema.java:140)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:206)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:268)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:244)
at io.confluent.kafka.schemaregistry.client.SchemaRegistryClient.register(SchemaRegistryClient.java:42)
at za.co.absa.abris.avro.read.confluent.SchemaManager.register(SchemaManager.scala:77)
at za.co.absa.abris.avro.read.confluent.SchemaManager.$anonfun$getIfExistsOrElseRegisterSchema$1(SchemaManager.scala:124)
at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.java:23)
at scala.Option.getOrElse(Option.scala:189)
at za.co.absa.abris.avro.read.confluent.SchemaManager.getIfExistsOrElseRegisterSchema(SchemaManager.scala:124)
at za.co.absa.abris.config.ToSchemaRegisteringConfigFragment.usingSchemaRegistry(Config.scala:135)
at za.co.absa.abris.config.ToSchemaRegisteringConfigFragment.usingSchemaRegistry(Config.scala:131)
at org.example.App$.main(App.scala:37)
at org.example.App.main(App.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```
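For context, here is a minimal sketch of the write path that reaches this code, based on the ABRiS 4.x configuration API visible in the stack trace (`ToSchemaRegisteringConfigFragment.usingSchemaRegistry`). The topic name, registry URL, broker address, and schema are placeholders, not values from the issue:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, struct}
import za.co.absa.abris.avro.functions.to_avro
import za.co.absa.abris.config.AbrisConfig

object App {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("simplespark").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    // usingSchemaRegistry registers the schema eagerly; per the stack trace,
    // this is the call that reaches CachedSchemaRegistryClient.register and
    // dies with NoSuchFieldError: FACTORY.
    val abrisConfig = AbrisConfig
      .toConfluentAvro
      .provideAndRegisterSchema(
        """{"type":"record","name":"Person","fields":[
          |{"name":"id","type":"int"},{"name":"name","type":"string"}]}""".stripMargin)
      .usingTopicNameStrategy("my-topic")
      .usingSchemaRegistry("http://localhost:8081")

    df.select(to_avro(struct(col("id"), col("name")), abrisConfig) as "value")
      .write
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("topic", "my-topic")
      .save()
  }
}
```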
Before implementing this as part of our tool, we wrote a bare-minimum project for testing it: https://github.com/markovarghese/simplespark.
Any immediate help would be greatly appreciated, as we are trying to onboard a few users who are on hold for this functionality. Please feel free to contact us with any questions.
Thanks for looking into this.
Regards,
Srini
Issue Analytics
- State:
- Created 3 years ago
- Comments: 12
Top GitHub Comments
@markovarghese version 4.0.1, which should fix this issue, has been released.
Would you mind testing whether it works as expected?
Thanks for the help. I will close this issue; feel free to open a new one if needed.
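For anyone hitting the same `NoSuchFieldError: FACTORY`, it usually indicates an Avro version conflict on the classpath: the Avro that Spark bundles (1.8.x in older distributions) is older than the one the Confluent schema-registry client compiles against. Besides upgrading to ABRiS 4.0.1, explicitly pinning a newer Avro in the build can help. A hedged build.sbt sketch; the versions are illustrative assumptions, not taken from the issue:

```scala
// build.sbt fragment -- versions are illustrative; check the ABRiS
// compatibility matrix for your Spark and Confluent versions.
libraryDependencies ++= Seq(
  "za.co.absa" %% "abris" % "4.0.1",
  // Force an Avro new enough for the Confluent client's
  // org.apache.avro.Schemas helper; Spark's bundled 1.8.x lacks
  // the FACTORY field it references.
  "org.apache.avro" % "avro" % "1.10.2"
)
```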