java.lang.NoSuchMethodError: io.confluent.kafka.serializers.subject.TopicNameStrategy.subjectName(Ljava/lang/String;ZLorg/apache/avro/Schema;)Ljava/lang/String;
Hi, I am trying to deserialize a Confluent Kafka topic in a PySpark Structured Streaming program using the procedure described in https://github.com/AbsaOSS/ABRiS/blob/master/documentation/python-documentation.md
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types import *
import statistics
from pyspark.sql.column import Column, _to_java_column
KAFKA_TOPIC_NAME_CONS = "TEST_AVRO_TOPIC"
KAFKA_BOOTSTRAP_SERVERS_CONS = "kfk-bro1:9093"
def from_avro(col, config):
    """
    Avro deserialize
    :param col (PySpark column / str): column name "key" or "value"
    :param config (za.co.absa.abris.config.FromAvroConfig): abris config, generated from abris_config helper function
    :return: PySpark Column
    """
    jvm_gateway = SparkContext._active_spark_context._gateway.jvm
    abris_avro = jvm_gateway.za.co.absa.abris.avro
    return Column(abris_avro.functions.from_avro(_to_java_column(col), config))
def from_avro_abris_config(config_map, topic, is_key):
    """
    Create a from-Avro ABRiS config that downloads the reader schema from a schema registry
    :param config_map (dict[str, str]): configuration map to pass to the deserializer, e.g. {'schema.registry.url': 'http://localhost:8081'}
    :param topic (str): Kafka topic
    :param is_key (bool): use the "<topic>-key" subject instead of "<topic>-value"
    :return: za.co.absa.abris.config.FromAvroConfig
    """
    jvm_gateway = SparkContext._active_spark_context._gateway.jvm
    scala_map = jvm_gateway.PythonUtils.toScalaMap(config_map)
    return jvm_gateway.za.co.absa.abris.config \
        .AbrisConfig \
        .fromConfluentAvro() \
        .downloadReaderSchemaByLatestVersion() \
        .andTopicNameStrategy(topic, is_key) \
        .usingSchemaRegistry(scala_map)
# SparkSession is created explicitly here so the snippet is self-contained (app name is arbitrary)
spark = SparkSession.builder.appName("abris_avro_test").getOrCreate()

df = spark \
.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", KAFKA_BOOTSTRAP_SERVERS_CONS) \
.option("subscribe", KAFKA_TOPIC_NAME_CONS) \
.option("kafka.security.protocol", "SSL") \
.option("kafka.ssl.endpoint.identification.algorithm", "") \
.option("kafka.ssl.truststore.location","C:\\ssl\\truststore.jks") \
.option("kafka.ssl.truststore.password","Password123") \
.option("kafka.ssl.keystore.location", "C:\\ssl\\keystore.jks") \
.option("kafka.ssl.keystore.password","Password123") \
.option("kafka.ssl.key.password","Password123") \
.option("startingOffsets", "earliest") \
.load()
from_avro_abris_settings = from_avro_abris_config({'schema.registry.url': 'http://sr1:8082'}, KAFKA_TOPIC_NAME_CONS, True)
df2 = df.withColumn("parsed", from_avro("value", from_avro_abris_settings))
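If the deserialization succeeded, the parsed struct would then be consumed through a normal streaming sink, for example a console sink (a minimal sketch of the intended next step, not part of the snippet above):

# Write the deserialized fields to the console for inspection
query = df2.select("parsed.*") \
    .writeStream \
    .format("console") \
    .option("truncate", False) \
    .start()
query.awaitTermination()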
However, I am getting the following error in the function from_avro_abris_config:
return jvm_gateway.za.co.absa.abris.config \ <<== Error
.AbrisConfig \
.fromConfluentAvro() \
.downloadReaderSchemaByLatestVersion() \
.andTopicNameStrategy(topic, is_key) \
.usingSchemaRegistry(scala_map)
File "C:\Users\admin\PycharmProjects\spark_hello_world\main.py", line 37, in from_avro_abris_config
return jvm_gateway.za.co.absa.abris.config \
File "C:\spark\python\lib\py4j-0.10.9-src.zip\py4j\java_gateway.py", line 1304, in __call__
File "C:\spark\python\lib\pyspark.zip\pyspark\sql\utils.py", line 128, in deco
File "C:\spark\python\lib\py4j-0.10.9-src.zip\py4j\protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o56.andTopicNameStrategy.
: java.lang.NoSuchMethodError: io.confluent.kafka.serializers.subject.TopicNameStrategy.subjectName(Ljava/lang/String;ZLorg/apache/avro/Schema;)Ljava/lang/String;
at za.co.absa.abris.avro.registry.SchemaSubject$.usingTopicNameStrategy(SchemaSubject.scala:43)
at za.co.absa.abris.config.FromStrategyConfigFragment.andTopicNameStrategy(Config.scala:220)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
...
...
...
There is a suggestion here to use a previous version of the Confluent libraries, but that does not make any difference. I am using Confluent 6.0.1 / Spark 3.0.2 along with ABRiS 4.2.0.
Can anyone point out where the issue could be?
Regards
Top GitHub Comments
I removed the duplicate JAR and invoked the pyspark shell with the following command, and it worked.
pyspark --packages org.apache.avro:avro:1.8.2,org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.2,org.apache.spark:spark-sql_2.12:3.0.2,za.co.absa:abris_2.12:4.1.0,io.confluent:kafka-avro-serializer:5.3.4,org.apache.kafka:kafka-clients:2.6.0,org.apache.spark:spark-token-provider-kafka-0-10_2.12:3.0.2,io.confluent:kafka-schema-registry-client:5.3.4 --repositories https://packages.confluent.io/maven/
Spark 3.0.2, ABRiS 4.2.0, Confluent Kafka 6.0.1
Thanks
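For reference, the same pinned packages can also be supplied from inside a standalone script instead of on the pyspark command line. A rough sketch under that assumption (package coordinates copied from the working command above; the app name is arbitrary, and this only takes effect if set before the JVM is started, e.g. via a plain spark-submit run):

# Same pinned dependencies expressed as SparkSession config
from pyspark.sql import SparkSession

packages = ",".join([
    "org.apache.avro:avro:1.8.2",
    "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.2",
    "org.apache.spark:spark-sql_2.12:3.0.2",
    "za.co.absa:abris_2.12:4.1.0",
    "io.confluent:kafka-avro-serializer:5.3.4",
    "org.apache.kafka:kafka-clients:2.6.0",
    "org.apache.spark:spark-token-provider-kafka-0-10_2.12:3.0.2",
    "io.confluent:kafka-schema-registry-client:5.3.4",
])

spark = (
    SparkSession.builder
    .appName("abris-avro-test")
    .config("spark.jars.packages", packages)
    .config("spark.jars.repositories", "https://packages.confluent.io/maven/")
    .getOrCreate()
)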
io.confluent:kafka-avro-serializer:5.3.4 is the correct library; it has the method mentioned in the error. You have io.confluent:kafka-schema-serializer twice there, and the latter one might be overriding the first one.
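To confirm which JAR actually wins on the classpath, you can ask the JVM where it loaded the class from via the Py4J gateway. A sketch, assuming an active SparkSession named spark:

# Print the location of the JAR that provides TopicNameStrategy at runtime
jvm = spark.sparkContext._gateway.jvm
cls = jvm.java.lang.Class.forName("io.confluent.kafka.serializers.subject.TopicNameStrategy")
print(cls.getProtectionDomain().getCodeSource().getLocation().toString())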