S3 Sink Connector: Support for Multiple Topics
Are multiple topics also supported? I have tried several variations, for example:
connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM testtopic, testtopic2 STOREAS JSON WITH_FLUSH_COUNT = 1
I keep getting the following error message:
java.lang.IllegalStateException: fatal: Can't find fileNamingStrategy in config
Can you help me? Thank you.
What version of the Stream Reactor are you reporting this issue for? 2.8
What is your connector properties configuration (my-connector.properties)?
name=S3SinkConnector
connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
tasks.max=1
connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM testtopic, testtopic2 STOREAS JSON WITH_FLUSH_COUNT = 1
connect.s3.aws.region=eu-central-1
topics=testtopic, testtopic2
schema.enable=false
errors.log.enable=true
key.converter.schemas.enable=false
value.converter.schemas.enable=false
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
Please provide full log files (redact any sensitive information)
[2022-05-20 10:07:46,074] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),0) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,103] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),4) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,137] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),1) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,163] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),3) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,199] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),2) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,302] ERROR WorkerSinkTask{id=S3SinkConnector-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:184)
java.lang.IllegalStateException: fatal: Can't find fileNamingStrategy in config
nonFatal:
Fatal TPs: HashSet(Set(TopicPartition(Topic(testtopic2),1)), Set(TopicPartition(Topic(testtopic2),0)), Set(TopicPartition(Topic(testtopic2),3)), Set(TopicPartition(Topic(testtopic2),4)), Set(TopicPartition(Topic(testtopic2),2)))
    at io.lenses.streamreactor.connect.aws.s3.sink.S3SinkTask.handleErrors(S3SinkTask.scala:125)
    at io.lenses.streamreactor.connect.aws.s3.sink.S3SinkTask.open(S3SinkTask.scala:212)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.openPartitions(WorkerSinkTask.java:635)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.access$1000(WorkerSinkTask.java:71)
    at org.apache.kafka.connect.runtime.WorkerSinkTask$HandleRebalance.onPartitionsAssigned(WorkerSinkTask.java:700)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsAssigned(ConsumerCoordinator.java:293)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:430)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:449)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:365)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:508)
    at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1261)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1230)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1210)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:452)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:324)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:182)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:231)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
[2022-05-20 10:07:46,307] INFO [Consumer clientId=connector-consumer-S3SinkConnector-0, groupId=connect-S3SinkConnector] Revoke previously assigned partitions testtopic-0, testtopic2-3, testtopic2-4, testtopic-2, testtopic-1, testtopic-4, testtopic-3, testtopic2-0, testtopic2-1, testtopic2-2 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:307)
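Note that the fatal partitions in the log are all for testtopic2, which suggests the comma-separated FROM clause was not parsed as two separate topics, leaving the second topic without a file-naming strategy. A minimal sketch of a configuration that may work instead, assuming the connector accepts multiple semicolon-separated KCQL statements (one per topic), as is the convention elsewhere in Stream Reactor:

# Sketch: one KCQL statement per topic, separated by a semicolon.
# The bucket name (kafka-backup) and flush count are repeated per statement.
topics=testtopic,testtopic2
connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM testtopic STOREAS JSON WITH_FLUSH_COUNT = 1; INSERT INTO kafka-backup SELECT * FROM testtopic2 STOREAS JSON WITH_FLUSH_COUNT = 1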
Top GitHub Comments
We have a use case where we need wildcard support. We can use topics.regex to have Kafka Connect subscribe to multiple topics, but it seems like the limitation is with KCQL. What we would really like is to write a single KCQL statement that works with any topic matched by the regex pattern, allowing us to partition based on topic name, known common fields, etc.
In fact, having to specify the topic name in the KCQL itself seems a bit redundant given the topics and topics.regex options. Maybe a special meta topic name could be introduced (e.g. __topic__) to indicate that the KCQL should be applied to any topic?

@davidsloan true, but not wildcards… Use case: I want to back up all the "domain" topics, and being able to say domain-.* would be great.
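For illustration, a sketch of what this proposal could look like. topics.regex is a standard Kafka Connect sink option and works today; the __topic__ placeholder in the KCQL is only the suggestion made in this thread, not an existing feature:

# Subscribe to every matching topic (standard Kafka Connect option):
topics.regex=domain-.*
# Hypothetical KCQL using the proposed __topic__ meta topic name, so a single
# statement would apply to every matched topic (not currently supported):
connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM __topic__ STOREAS JSON WITH_FLUSH_COUNT = 1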