Cassandra Sink - Error when consuming Avro messages
Hi,
I am trying to use the cassandra-sink connector to consume Avro messages from a topic. Each message is a GenericRecord. I am getting the "Unknown magic byte" error shown below:
```
[2016-11-01 10:58:51,944] INFO WorkerSinkTask{id=cassandra-sink-products-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask:261)
[2016-11-01 10:58:51,959] ERROR Task cassandra-sink-products-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:142)
org.apache.kafka.connect.errors.DataException: Failed to deserialize data to Avro:
    at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:109)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:356)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:226)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:170)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:142)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
[2016-11-01 10:58:51,959] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:143)
[2016-11-01 10:58:51,959] INFO Stopping Cassandra sink. (com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkTask:95)
[2016-11-01 10:58:51,959] INFO Shutting down Cassandra driver session and cluster. (com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter:166)
[2016-11-01 10:58:54,188] INFO Publish thread interrupted! (io.confluent.monitoring.clients.interceptor.MonitoringInterceptor:161)
[2016-11-01 10:58:54,296] INFO Publishing Monitoring Metrics stopped for clientID=consumer-4 (io.confluent.monitoring.clients.interceptor.MonitoringInterceptor:173)
[2016-11-01 10:58:54,296] INFO Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer:658)
[2016-11-01 10:58:54,300] INFO Closed monitoring interceptor for client ID=consumer-4 (io.confluent.monitoring.clients.interceptor.MonitoringInterceptor:195)
```
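For context on the error: Confluent's AvroConverter expects every message to follow the Confluent wire format, i.e. a magic byte (0x00) followed by a 4-byte Schema Registry ID and then the Avro-encoded payload. Messages written with a plain Avro serializer (or any other serializer) lack that prefix, so deserialization fails with "Unknown magic byte!". A minimal sketch of that check, illustrative only and not the converter's actual code:

```scala
import java.nio.ByteBuffer

// Confluent Avro wire format: [magic byte 0x00][4-byte schema ID][Avro payload].
// Sketch of the prefix check that produces the error seen in the log above.
def schemaIdOf(message: Array[Byte]): Int = {
  val buffer = ByteBuffer.wrap(message)
  val magic  = buffer.get()
  require(magic == 0, s"Unknown magic byte! (got $magic)")
  buffer.getInt() // schema ID as registered in the Schema Registry
}
```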
I tried both the official Apache Kafka artifact and the Confluent fork; the error is the same.
"io.confluent" % "kafka-avro-serializer" % "3.0.1",
"org.apache.kafka" % "kafka_2.11" % "0.10.0.1-cp1"
//"org.apache.kafka" % "kafka_2.11" % "0.10.0.1"
I can successfully feed the orders table when the producer records come from the console producer, as explained in the example (http://docs.datamountaineer.com/en/latest/cassandra-sink.html). However, when I try consuming records that my own application wrote to a topic, it does not work.
Does anyone have a solution to this problem? I searched the web and tried many things, but could not find a working solution.
Thanks, Ferhat
Top GitHub Comments
Hi, the problem is solved. It turned out to be the producer's serializer settings: KafkaAvroSerializer must be used for both the key and the value serializer.
I am posting the solution below, which may help future visitors who hit the same problem. Thanks for all your help.
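A minimal sketch of the fixed producer configuration — the broker and Schema Registry addresses are placeholders, not values from the original setup:

```scala
import java.util.Properties
import org.apache.avro.generic.GenericRecord
import org.apache.kafka.clients.producer.KafkaProducer

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")          // placeholder broker address
props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
props.put("schema.registry.url", "http://localhost:8081") // placeholder registry address

// With KafkaAvroSerializer on both key and value, records are written in the
// Confluent wire format that the sink's AvroConverter expects.
val producer = new KafkaProducer[String, GenericRecord](props)
```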
The relevant code can be found in my repo.