java.lang.NoSuchMethodError: io.confluent.kafka.serializers.subject.TopicNameStrategy.subjectName(Ljava/lang/String;ZLorg/apache/avro/Schema;)Ljava/lang/String;


Hi, I am trying to deserialize a Confluent Kafka topic in a PySpark Structured Streaming program using the procedure described in https://github.com/AbsaOSS/ABRiS/blob/master/documentation/python-documentation.md:

from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types import *
import statistics
from pyspark.sql.column import Column, _to_java_column


KAFKA_TOPIC_NAME_CONS = "TEST_AVRO_TOPIC"
KAFKA_BOOTSTRAP_SERVERS_CONS = "kfk-bro1:9093"

# Spark session used by the readStream below (the app name is arbitrary)
spark = SparkSession.builder.appName("avro_streaming_app").getOrCreate()

def from_avro(col, config):
    """
    avro deserialize

    :param col (PySpark column / str): column name "key" or "value"
    :param config (za.co.absa.abris.config.FromAvroConfig): abris config, generated from abris_config helper function
    :return: PySpark Column
    """
    jvm_gateway = SparkContext._active_spark_context._gateway.jvm
    abris_avro = jvm_gateway.za.co.absa.abris.avro
    return Column(abris_avro.functions.from_avro(_to_java_column(col), config))

def from_avro_abris_config(config_map, topic, is_key):
    """
    Create from avro abris config with a schema url

    :param config_map (dict[str, str]): configuration map to pass to deserializer, ex: {'schema.registry.url': 'http://localhost:8081'}
    :param topic (str): kafka topic
    :param is_key (bool): boolean
    :return: za.co.absa.abris.config.FromAvroConfig
    """
    jvm_gateway = SparkContext._active_spark_context._gateway.jvm
    scala_map = jvm_gateway.PythonUtils.toScalaMap(config_map)
    return jvm_gateway.za.co.absa.abris.config \
        .AbrisConfig \
        .fromConfluentAvro() \
        .downloadReaderSchemaByLatestVersion() \
        .andTopicNameStrategy(topic, is_key) \
        .usingSchemaRegistry(scala_map)

df = spark \
        .readStream \
        .format("kafka") \
        .option("kafka.bootstrap.servers", KAFKA_BOOTSTRAP_SERVERS_CONS) \
        .option("subscribe", KAFKA_TOPIC_NAME_CONS) \
        .option("kafka.security.protocol", "SSL") \
        .option("kafka.ssl.endpoint.identification.algorithm", "") \
        .option("kafka.ssl.truststore.location","C:\\ssl\\truststore.jks") \
        .option("kafka.ssl.truststore.password","Password123") \
        .option("kafka.ssl.keystore.location", "C:\\ssl\\keystore.jks")  \
        .option("kafka.ssl.keystore.password","Password123") \
        .option("kafka.ssl.key.password","Password123") \
        .option("startingOffsets", "earliest") \
        .load()

from_avro_abris_settings = from_avro_abris_config({'schema.registry.url': 'http://sr1:8082'}, KAFKA_TOPIC_NAME_CONS, True)
df2 = df.withColumn("parsed", from_avro("value", from_avro_abris_settings))

However, I am getting the following error in the function from_avro_abris_config:

return jvm_gateway.za.co.absa.abris.config \  <<== Error 
        .AbrisConfig \
        .fromConfluentAvro() \
        .downloadReaderSchemaByLatestVersion() \
        .andTopicNameStrategy(topic, is_key) \
        .usingSchemaRegistry(scala_map)

File "C:\Users\admin\PycharmProjects\spark_hello_world\main.py", line 37, in from_avro_abris_config
    return jvm_gateway.za.co.absa.abris.config \
File "C:\spark\python\lib\py4j-0.10.9-src.zip\py4j\java_gateway.py", line 1304, in __call__
  File "C:\spark\python\lib\pyspark.zip\pyspark\sql\utils.py", line 128, in deco
  File "C:\spark\python\lib\py4j-0.10.9-src.zip\py4j\protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o56.andTopicNameStrategy.
: java.lang.NoSuchMethodError: io.confluent.kafka.serializers.subject.TopicNameStrategy.subjectName(Ljava/lang/String;ZLorg/apache/avro/Schema;)Ljava/lang/String;
	at za.co.absa.abris.avro.registry.SchemaSubject$.usingTopicNameStrategy(SchemaSubject.scala:43)
	at za.co.absa.abris.config.FromStrategyConfigFragment.andTopicNameStrategy(Config.scala:220)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
...

There is a suggestion here to use a previous version of the Confluent libraries, but that does not make any difference. I am using Confluent 6.0.1 / Spark 3.0.2 along with ABRiS 4.2.0.

Can anyone point out where the issue could be?

Regards

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7

Top GitHub Comments

1 reaction
DwijadasDey commented on Jul 2, 2021

I removed the duplicated JAR and invoked the pyspark shell with the following command, and it worked.

pyspark --packages org.apache.avro:avro:1.8.2,org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.2,org.apache.spark:spark-sql_2.12:3.0.2,za.co.absa:abris_2.12:4.1.0,io.confluent:kafka-avro-serializer:5.3.4,org.apache.kafka:kafka-clients:2.6.0,org.apache.spark:spark-token-provider-kafka-0-10_2.12:3.0.2,io.confluent:kafka-schema-registry-client:5.3.4 --repositories https://packages.confluent.io/maven/

Spark 3.0.2, ABRiS 4.2.0, Confluent Kafka 6.0.1

Thanks
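
For reference, the same set of dependencies can also be supplied programmatically when building the SparkSession instead of on the pyspark command line; a minimal sketch, with the coordinates and Confluent repository copied from the command above (the app name is arbitrary):

from pyspark.sql import SparkSession

# Same coordinates as the pyspark --packages command above
packages = ",".join([
    "org.apache.avro:avro:1.8.2",
    "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.2",
    "org.apache.spark:spark-sql_2.12:3.0.2",
    "za.co.absa:abris_2.12:4.1.0",
    "io.confluent:kafka-avro-serializer:5.3.4",
    "org.apache.kafka:kafka-clients:2.6.0",
    "org.apache.spark:spark-token-provider-kafka-0-10_2.12:3.0.2",
    "io.confluent:kafka-schema-registry-client:5.3.4",
])

spark = (
    SparkSession.builder
    .appName("abris_avro_stream")  # arbitrary app name
    .config("spark.jars.packages", packages)
    .config("spark.jars.repositories", "https://packages.confluent.io/maven/")
    .getOrCreate()
)

Both settings have to be in place before the session is created, which is why they go on the builder rather than on an already running session.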

0 reactions
cerveada commented on Jul 2, 2021

io.confluent:kafka-avro-serializer:5.3.4 is the library that provides the method mentioned in the error.

You have io.confluent:kafka-schema-serializer twice there; the latter one might be overriding the first one.
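
One way to confirm which JAR actually wins on the driver classpath is to ask the JVM where it loaded the class from; a quick diagnostic sketch (the class name comes from the error above, the rest is plain Java reflection through the same py4j gateway the question's helper functions already use):

from pyspark import SparkContext

# Requires an active SparkContext; prints the path of the JAR that provides the class
jvm = SparkContext._active_spark_context._gateway.jvm
clazz = jvm.java.lang.Class.forName(
    "io.confluent.kafka.serializers.subject.TopicNameStrategy")
print(clazz.getProtectionDomain().getCodeSource().getLocation())

If the printed location is not the JAR you expect, that is the duplicate to remove or exclude.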
