[Bug] io.confluent.connect.avro.AvroConverter does not work as a key/value converter in KafkaConnectors
Describe the bug This has got to be either a known bug, or I’m doing something stupid. I’m trying to use io.confluent.connect.avro.AvroConverter for key and value (de)serialization. I’ve tried it with a few different Kafka Connect connectors; to reproduce it without much complexity, I used the S3 connector, and it fails every time.
I’ve tried downloading this and putting it in my plugins directory, but it still doesn’t seem to work: https://www.confluent.io/hub/confluentinc/kafka-connect-avro-converter
I started with a fairly straightforward configuration that worked. I copied the “kafka-connect-s3” directory from the confluent platform 5.5 (is there possibly a compatibility issue here??) directory. Also copied “kafka-connect-storage-common” (there’s a dependency there…)
Everything generally works well until I try to use the AvroConverter. In “kafka-connect-storage-common” there is this jar: kafka-connect-avro-converter-5.5.0.jar, which should be all I need. But all I get is this:
Tasks:
Id: 0
State: FAILED
Trace: java.lang.NoClassDefFoundError: io/confluent/connect/avro/AvroConverterConfig
at io.confluent.connect.avro.AvroConverter.configure(AvroConverter.java:64)
at org.apache.kafka.connect.runtime.isolation.Plugins.newConverter(Plugins.java:266)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:417)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
While trying this configuration with other connectors, I’ve sometimes been able to get a java.lang.NoClassDefFoundError for AbstractConfig instead. When I then add the Confluent common jars, it goes back to the NoClassDefFoundError for AvroConverterConfig.
Something must be going on that I’m not seeing here.
Thanks!
To Reproduce I use the following KafkaConnector configuration:
apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
metadata:
  name: kafka-connector-s3-avro
  labels:
    strimzi.io/cluster: kafkaconnect-cluster
spec:
  class: io.confluent.connect.s3.S3SinkConnector
  tasksMax: 1
  config:
    format.class: io.confluent.connect.s3.format.json.JsonFormat
    s3.compression.type: gzip
    partitioner.class: io.confluent.connect.storage.partitioner.HourlyPartitioner
    topics: avrokafkamessagestopic
    s3.region: us-east-2
    s3.bucket.name: avrokafkamessages
    flush.size: 1
    storage.class: io.confluent.connect.s3.storage.S3Storage
    locale: en-US
    timezone: UTC
    schemas.enable: false
    key.converter: io.confluent.connect.avro.AvroConverter
    key.converter.schema.registry.url: http://schema-registry-release-cp-schema-registry:8081
    value.converter: io.confluent.connect.avro.AvroConverter
    value.converter.schema.registry.url: http://schema-registry-release-cp-schema-registry:8081
I know that what I’m doing isn’t supported out of the box, so I’ve followed the many tutorials on how to create your own Kafka Connect image. Here’s my Dockerfile:
FROM strimzi/kafka-connect:0.11.4-kafka-2.1.0
USER root:root
COPY ./connect-plugins/ /opt/kafka/plugins/
USER 1001
I add the following directories from the Confluent Platform 5.5 share/java directory to my ./connect-plugins directory: kafka-connect-s3 and kafka-connect-storage-common.
And my KafkaConnect configuration:
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaConnect
metadata:
  name: kafkaconnect-cluster
  annotations:
    # use-connector-resources configures this KafkaConnect
    # to use KafkaConnector resources to avoid
    # needing to call the Connect REST API directly
    strimzi.io/use-connector-resources: "true"
spec:
  version: 2.4.0
  replicas: 1
  bootstrapServers: kafka-cluster-kafka-external-bootstrap:9094
  image: ecr-repo/kafkaconnectors:tagname
  config:
    group.id: kafkaconnect-cluster
    offset.storage.topic: kafkaconnect-cluster-offsets
    offset.storage.replication.factor: 1
    config.storage.topic: kafkaconnect-cluster-configs
    config.storage.replication.factor: 1
    status.storage.topic: kafkaconnect-cluster-status
    status.storage.replication.factor: 1
  externalConfiguration:
    env:
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            name: aws-creds
            key: awsAccessKey
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            name: aws-creds
            key: awsSecretAccessKey
  metrics:
    lowercaseOutputName: true
    lowercaseOutputLabelNames: true
    rules:
      - pattern: "kafka.connect<type=connect-worker-metrics>([^:]+):"
        name: "kafka_connect_connect_worker_metrics_$1"
      - pattern: "kafka.connect<type=connect-metrics, client-id=([^:]+)><>([^:]+)"
        name: "kafka_connect_connect_metrics_$1_$2"
Expected behavior Would like to be able to use the AvroConverter. I can’t be the only one, it must be something I’m doing wrong.
Environment (please complete the following information):
- Strimzi version: 0.17.0
- Installation method: Strimzi operator
- Kubernetes cluster: Kubernetes 1.16
- Infrastructure: Amazon EKS
Top GitHub Comments
@timkalanai thanks mate - thats greatly appreciated, with your help I was able to get it working - thanks again
here’s the Dockerfile i use
I think I figured it out, and it’s weird…
I think my problem was with the way that kafka connect scans for “Connectors” versus “Converters”. There’s a lot of classloading magic in that Plugins file mentioned above. I’m probably not going to do an explanation justice because I don’t quite get it myself.
But as it scans through the folder structure (you can see it loading connectors and converters in the logs as Kafka Connect starts up), it looks for Converters and Connectors in parallel. That said, I think because it tries to isolate connectors within each directory in the plugins folder, the dependencies for each have to be within that directory (which is what we thought).
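One rough way to check that per-directory requirement is to look at each top-level plugin directory and see whether the converter jar and a common-config jar sit together. The sketch below only matches jar file names (it does not open the jars), the function name `report_plugins` is made up for illustration, and following symlinks with `find -L` is an assumption based on the deep traversal described in this comment.

```shell
#!/bin/sh
# Sketch only: checks jar file names, not jar contents.
# report_plugins PLUGINS_DIR
#   Prints one line per top-level plugin directory, saying whether it holds
#   an avro-converter jar and a common-config jar.
report_plugins() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        conv=no
        common=no
        # -L follows symlinked directories (assumed to match Connect's scan).
        if [ -n "$(find -L "$dir" -name 'kafka-connect-avro-converter-*.jar' | head -n 1)" ]; then
            conv=yes
        fi
        if [ -n "$(find -L "$dir" -name 'common-config-*.jar' | head -n 1)" ]; then
            common=yes
        fi
        echo "$name: avro-converter=$conv common-config=$common"
    done
}
```

Calling `report_plugins ./connect-plugins` before building the image gives a quick view of which directories have the converter without the common jars next to it.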
Confluent has laid out their folder structure a little differently. There’s a commons folder with common libs (I needed common-config.jar, but there are others in there too). There’s also a “kafka-connect-storage-common” folder that has AvroConverter.
Confluent puts a symlink in the connector directories to make sure that they have access to the right converters. I think when Connect is traversing each folder, it does a deep traversal, and a symlink looks like a directory.
I had a number of issues, but I finally solved it by keeping the symlink and putting the confluent-common jars into the kafka-connect-storage-common folder. That way, when AvroConverter is detected in the kafka-connect-storage-common folder, it loads in the same class loader as the common jars in the same folder.
And for any connector I need AvroConverter for, I add a symlink to the “kafka-connect-storage-common” directory.
I know it’s convoluted. Maybe I’m just too tired and not seeing straight, but everything seems to be working now. Hope this helps some person in the future.
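For anyone trying to reproduce that layout, here is a minimal shell sketch of the steps, not a tested recipe for every Confluent version. The function name `layout_plugins` is made up, and the source directory name `confluent-common` and the link name `storage-common` are assumptions about the Confluent share/java layout, so check them against your own download.

```shell
#!/bin/sh
# Sketch only. Directory and link names are assumptions; verify them locally.
# layout_plugins SRC DST
#   SRC: share/java directory of an unpacked Confluent Platform (assumed layout)
#   DST: directory that the Dockerfile copies to /opt/kafka/plugins/
layout_plugins() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1

    # Copy the connector and the shared directory that holds AvroConverter.
    cp -R "$src/kafka-connect-storage-common" "$dst/" || return 1
    cp -R "$src/kafka-connect-s3" "$dst/" || return 1

    # Put the Confluent common jars next to the AvroConverter jar so that
    # they end up in the same plugin directory.
    cp "$src"/confluent-common/*.jar "$dst/kafka-connect-storage-common/" || return 1

    # Link the shared directory into the connector that needs AvroConverter.
    ln -sfn ../kafka-connect-storage-common "$dst/kafka-connect-s3/storage-common"
}
```

A call such as `layout_plugins ./confluent-5.5.0/share/java ./connect-plugins` (the first path is hypothetical) would prepare the directory before `docker build`. The link is relative so that it still resolves once the directory is copied into the image.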