
S3 Sink Connector Support of many Topics

See original GitHub issue


Are multiple topics also supported? I’ve tried several different syntaxes.

For example: connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM testtopic, testtopic2 STOREAS JSON WITH_FLUSH_COUNT = 1

I keep getting the following error message:

java.lang.IllegalStateException: fatal: Can't find fileNamingStrategy in config

Can you help me?

Thank you

What version of the Stream Reactor are you reporting this issue for? 2.8

What is your connector properties configuration (my-connector.properties)?

name=S3SinkConnector
connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
tasks.max=1
key.converter.schemas.enable=false
connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM testtopic, testtopic2 STOREAS JSON WITH_FLUSH_COUNT = 1
connect.s3.aws.region=eu-central-1
topics=testtopic, testopic2
schema.enable=false
errors.log.enable=true
key.converter.schemas.enable=false
value.converter.schemas.enable=false
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter

Please provide full log files (redact any sensitive information)

[2022-05-20 10:07:46,074] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),0) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,103] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),4) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,137] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),1) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,163] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),3) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,199] INFO [S3SinkConnector] Seeked offset None for TP TopicPartition(Topic(testtopic),2) (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:144)
[2022-05-20 10:07:46,302] ERROR WorkerSinkTask{id=S3SinkConnector-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:184)
java.lang.IllegalStateException: fatal: Can't find fileNamingStrategy in config

nonFatal:

Fatal TPs: HashSet(Set(TopicPartition(Topic(testtopic2),1)), Set(TopicPartition(Topic(testtopic2),0)), Set(TopicPartition(Topic(testtopic2),3)), Set(TopicPartition(Topic(testtopic2),4)), Set(TopicPartition(Topic(testtopic2),2)))
	at io.lenses.streamreactor.connect.aws.s3.sink.S3SinkTask.handleErrors(S3SinkTask.scala:125)
	at io.lenses.streamreactor.connect.aws.s3.sink.S3SinkTask.open(S3SinkTask.scala:212)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.openPartitions(WorkerSinkTask.java:635)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.access$1000(WorkerSinkTask.java:71)
	at org.apache.kafka.connect.runtime.WorkerSinkTask$HandleRebalance.onPartitionsAssigned(WorkerSinkTask.java:700)
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsAssigned(ConsumerCoordinator.java:293)
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:430)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:449)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:365)
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:508)
	at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1261)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1230)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1210)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:452)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:324)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:182)
	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:231)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
[2022-05-20 10:07:46,307] INFO [Consumer clientId=connector-consumer-S3SinkConnector-0, groupId=connect-S3SinkConnector] Revoke previously assigned partitions testtopic-0, testtopic2-3, testtopic2-4, testtopic-2, testtopic-1, testtopic-4, testtopic-3, testtopic2-0, testtopic2-1, testtopic2-2 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:307)
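For readers hitting the same error: the connect.s3.kcql property accepts more than one statement separated by semicolons, so one possible workaround is to give each topic its own INSERT rather than listing both topics in a single FROM clause. This is only a sketch based on the KCQL multi-statement syntax and has not been verified against Stream Reactor 2.8:

# Sketch only: one KCQL statement per topic, separated by ";"
connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM testtopic STOREAS JSON WITH_FLUSH_COUNT = 1; INSERT INTO kafka-backup SELECT * FROM testtopic2 STOREAS JSON WITH_FLUSH_COUNT = 1
# topics must list every topic the KCQL refers to
# (the configuration above spells the second one "testopic2", which looks like a typo)
topics=testtopic,testtopic2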

Issue Analytics

  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

2 reactions
jalaziz commented, Oct 3, 2022

We have a use case where we need wildcard support. We can use topics.regex to have Kafka Connect subscribe to multiple topics, but it seems like the limitation is with KCQL.

What we would really like is to be able to write a single KCQL statement that applies to any topic matched by the regex pattern, allowing us to partition based on topic name, known common fields, etc.

In fact, having to specify the topic name in the KCQL itself seems a bit redundant given the topics and topics.regex options. Maybe a special meta topic name could be introduced (e.g. __topic__) to indicate that the KCQL should be applied to any topic?
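To make that concrete, the requested behaviour might look like the sketch below. The __topic__ placeholder and the topic pattern are only illustrations of the proposal in this comment, not something the connector supports:

# Hypothetical configuration illustrating the feature request above (not supported today)
topics.regex=my-topics-.*
connect.s3.kcql=INSERT INTO kafka-backup SELECT * FROM __topic__ STOREAS JSON WITH_FLUSH_COUNT = 1
# __topic__ would stand for whichever topic the regex matched, so one statement covers them all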

1 reaction
AlexeyRaga commented, Jun 8, 2022

@davidsloan true, but not wildcards… Use case: I want to back up all the “domain” topics, and being able to say domain-.* would be great.
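For context, the regex subscription itself is already standard Kafka Connect; the gap is only on the KCQL side, which still needs explicit topic names. A minimal sketch of the Connect-level part:

# Standard Kafka Connect sink setting: subscribe to every topic matching the pattern
# (topics and topics.regex are mutually exclusive)
topics.regex=domain-.*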

Read more comments on GitHub >

Top Results From Across the Web

Amazon S3 Sink connector for Confluent Cloud Quick Start
Data Format with or without a Schema: The connector supports input data from Kafka topics in Avro, JSON Schema, Protobuf, JSON (schemaless), or...
Read more >
amazon s3 - Can I map multiple buckets with multiple topics in ...
The S3 bucket name is specified per-connector, so you will need to create one connector per bucket. Note that...
Read more >
Amazon S3 sink connector
This example shows how to use the Confluent Amazon S3 sink connector and the AWS CLI to create an Amazon S3 sink connector...
Read more >
Kafka Connect S3 Examples - Supergloo -
In this Kafka Connect S3 tutorial, let's demo multiple Kafka S3 integration examples. We'll cover writing to S3 from one topic and also...
Read more >
Support multiple schemas in a topic · Issue #247 - GitHub
Kafka supports multiple types of schema in one topic with this PR: confluentinc/schema-registry#680. This introduced a problem for S3 ...
Read more >
