kafka-influxdb-sink cannot handle Avro type array anymore
What version of the Stream Reactor are you reporting this issue for?
2.1.3
Are you running the correct version of Kafka/Confluent for the Stream Reactor release?
Yes, with Confluent Kafka 5.5.2
Have you read the docs?
Yes
What is the expected behaviour?
I would expect kafka-influxdb-sink 2.1.3 to support Avro records with fields of type array (see PR #522). In fact, I contributed that PR in the past, but I didn’t include a test for it 😦
I can try to fix this with some guidance.
What was observed?
Producing an Avro message that contains an array field causes the sink task to fail with the error shown in the logs below. The message was produced with:
docker-compose exec schema-registry kafka-avro-console-producer --bootstrap-server broker:29092 --topic foo --property value.schema='{"type":"record", "name":"foo", "fields":[{"name":"bar","type":"string"}, {"name":"baz","type":{"type":"array","items":"float"}}]}'
{"bar": "John Doe","baz": [1,2,3]}
What is your Connect cluster configuration (connect-avro-distributed.properties)?
$ docker-compose exec connect cat ./etc/schema-registry/connect-avro-distributed.properties
bootstrap.servers=localhost:9092
group.id=connect-cluster
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-statuses
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
plugin.path=/usr/share/java,/usr/share/confluent-hub-components
What is your connector properties configuration (my-connector.properties)?
docker-compose run kafkaconnect config influxdb-sink
Creating angelofausti_kafkaconnect_run ... done
{
"connect.influx.db": "mydb",
"connect.influx.error.policy": "THROW",
"connect.influx.kcql": "INSERT INTO foo SELECT * FROM foo WITHTIMESTAMP sys_time()",
"connect.influx.max.retries": "10",
"connect.influx.password": "",
"connect.influx.retry.interval": "60000",
"connect.influx.timestamp": "sys_time()",
"connect.influx.url": "http://influxdb:8086",
"connect.influx.username": "-",
"connect.progress.enabled": "false",
"connector.class": "com.datamountaineer.streamreactor.connect.influx.InfluxSinkConnector",
"name": "influxdb-sink",
"tasks.max": "1",
"topics": "foo"
}
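For reference, the same connector configuration can also be applied through the standard Kafka Connect REST API; the host and port below are assumptions for this compose setup:

$ curl -s -X PUT -H "Content-Type: application/json" \
    http://localhost:8083/connectors/influxdb-sink/config \
    -d '{
      "connector.class": "com.datamountaineer.streamreactor.connect.influx.InfluxSinkConnector",
      "tasks.max": "1",
      "topics": "foo",
      "connect.influx.url": "http://influxdb:8086",
      "connect.influx.db": "mydb",
      "connect.influx.username": "-",
      "connect.influx.kcql": "INSERT INTO foo SELECT * FROM foo WITHTIMESTAMP sys_time()",
      "connect.influx.error.policy": "THROW",
      "connect.influx.max.retries": "10",
      "connect.influx.retry.interval": "60000"
    }'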
Please provide full log files (redact any sensitive information)
connect | [2021-01-26 23:12:40,756] INFO Empty list of records received. (com.datamountaineer.streamreactor.connect.influx.InfluxSinkTask)
connect | [2021-01-26 23:12:42,631] ERROR Encountered error Can't select field:'baz' because it leads to value:'[1.0, 2.0, 3.0]' (java.util.ArrayList)is not a valid type for InfluxDb. (com.datamountaineer.streamreactor.connect.influx.writers.InfluxDbWriter)
connect | java.lang.RuntimeException: Can't select field:'baz' because it leads to value:'[1.0, 2.0, 3.0]' (java.util.ArrayList)is not a valid type for InfluxDb.
connect | at com.datamountaineer.streamreactor.connect.influx.converters.InfluxPoint$.writeField(InfluxPoint.scala:91)
connect | at com.datamountaineer.streamreactor.connect.influx.converters.InfluxPoint$.$anonfun$addValuesAndTags$6(InfluxPoint.scala:36)
connect | at scala.util.Success.flatMap(Try.scala:251)
connect | at com.datamountaineer.streamreactor.connect.influx.converters.InfluxPoint$.$anonfun$addValuesAndTags$5(InfluxPoint.scala:35)
connect | at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
connect | at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
connect | at scala.collection.immutable.List.foldLeft(List.scala:89)
connect | at com.datamountaineer.streamreactor.connect.influx.converters.InfluxPoint$.addValuesAndTags(InfluxPoint.scala:34)
connect | at com.datamountaineer.streamreactor.connect.influx.converters.InfluxPoint$.$anonfun$build$6(InfluxPoint.scala:23)
connect | at scala.util.Success.flatMap(Try.scala:251)
connect | at com.datamountaineer.streamreactor.connect.influx.converters.InfluxPoint$.build(InfluxPoint.scala:20)
connect | at com.datamountaineer.streamreactor.connect.influx.writers.InfluxBatchPointsBuilder.$anonfun$build$4(InfluxBatchPointsBuilder.scala:94)
connect | at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:273)
connect | at scala.collection.Iterator.foreach(Iterator.scala:943)
connect | at scala.collection.Iterator.foreach$(Iterator.scala:943)
connect | at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
connect | at scala.collection.IterableLike.foreach(IterableLike.scala:74)
connect | at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
connect | at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
connect | at scala.collection.TraversableLike.map(TraversableLike.scala:273)
connect | at scala.collection.TraversableLike.map$(TraversableLike.scala:266)
connect | at scala.collection.AbstractTraversable.map(Traversable.scala:108)
connect | at com.datamountaineer.streamreactor.connect.influx.writers.InfluxBatchPointsBuilder.$anonfun$build$3(InfluxBatchPointsBuilder.scala:94)
connect | at scala.Option.map(Option.scala:230)
connect | at com.datamountaineer.streamreactor.connect.influx.writers.InfluxBatchPointsBuilder.$anonfun$build$2(InfluxBatchPointsBuilder.scala:94)
connect | at scala.util.Success.flatMap(Try.scala:251)
connect | at com.datamountaineer.streamreactor.connect.influx.writers.InfluxBatchPointsBuilder.$anonfun$build$1(InfluxBatchPointsBuilder.scala:91)
connect | at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:273)
connect | at scala.collection.Iterator.foreach(Iterator.scala:943)
connect | at scala.collection.Iterator.foreach$(Iterator.scala:943)
connect | at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
connect | at scala.collection.IterableLike.foreach(IterableLike.scala:74)
connect | at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
connect | at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
connect | at scala.collection.TraversableLike.map(TraversableLike.scala:273)
connect | at scala.collection.TraversableLike.map$(TraversableLike.scala:266)
connect | at scala.collection.AbstractTraversable.map(Traversable.scala:108)
connect | at com.datamountaineer.streamreactor.connect.influx.writers.InfluxBatchPointsBuilder.build(InfluxBatchPointsBuilder.scala:88)
connect | at com.datamountaineer.streamreactor.connect.influx.writers.InfluxDbWriter.write(InfluxDbWriter.scala:45)
connect | at com.datamountaineer.streamreactor.connect.influx.InfluxSinkTask.$anonfun$put$2(InfluxSinkTask.scala:77)
connect | at com.datamountaineer.streamreactor.connect.influx.InfluxSinkTask.$anonfun$put$2$adapted(InfluxSinkTask.scala:77)
connect | at scala.Option.foreach(Option.scala:407)
connect | at com.datamountaineer.streamreactor.connect.influx.InfluxSinkTask.put(InfluxSinkTask.scala:77)
connect | at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:549)
connect | at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:329)
connect | at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232)
connect | at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:204)
connect | at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:185)
connect | at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:235)
connect | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
connect | at java.util.concurrent.FutureTask.run(FutureTask.java:266)
connect | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
connect | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
connect | at java.lang.Thread.run(Thread.java:748)
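Because connect.influx.error.policy is THROW, the task fails on this exception and stays failed. Its state can be checked, and the task restarted after a fix, through the Connect REST API (host and port assumed for this setup):

$ curl -s http://localhost:8083/connectors/influxdb-sink/status
$ curl -s -X POST http://localhost:8083/connectors/influxdb-sink/tasks/0/restart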
Top GitHub Comments
I can confirm that the new array handling implementation works fine for me. I’ve built kafka-connect-influxdb from master, followed these steps to produce an Avro-encoded message with an array, and verified that it is flattened in InfluxDB. This issue can be closed.
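One quick way to confirm the flattened point in InfluxDB, assuming the compose service is named influxdb and the InfluxDB 1.x CLI is available in the container:

$ docker-compose exec influxdb influx -database mydb -execute 'SELECT * FROM foo'

The array elements should appear as separate fields on the foo measurement; the exact field names depend on how the new implementation flattens them.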
@afausti thanks!