Can you read from multiple source Kafka clusters and write to multiple Kafka clusters?
Hello,
We are doing a POC for uReplicator to be used across multiple regions/sites. Let me share the configs:
------- Controller Configs -------
cat start-controller-with-args.sh
#!/bin/bash
export hoa_srcCluster='ZK-CLUSTER-1:2181/srcCluster'
export asd_desCluster='ZK-CLUSTER-2/destCluster'
export hoa_zk='ZK-CLUSTER-1:2181'
args="-helixClusterName testMirrorMaker"
args="${args} -destKafkaZkPath ${asd_desCluster}"
args="${args} -srcKafkaZkPath ${hoa_srcCluster}"
args="${args} -zookeeper ${hoa_srcCluster}"
args="${args} -port 10000 -mode auto"
args="${args} -enableAutoWhitelist true"
args="${args} -autoRebalanceDelayInSeconds 120 -backUpToGit false"
args="${args} -localBackupFilePath ~/backup_"
echo " /bin/bash ./start-controller.sh startMirrorMakerController ${args}"
/bin/bash ./start-controller.sh startMirrorMakerController ${args}
------- Worker configs -------
cat /app/uReplicator/config/consumer.properties | grep -vE "^#"
zookeeper.connect=ZK-CLUSTER-1:2181/srcCluster
zookeeper.connection.timeout.ms=30000
zookeeper.session.timeout.ms=30000
group.id=kloak-mirrormaker-test
consumer.id=kloakmms01-sjc1
partition.assignment.strategy=roundrobin
socket.receive.buffer.bytes=1048576
fetch.message.max.bytes=8388608
queued.max.message.chunks=5
auto.offset.reset=smallest
cat /app/uReplicator/config/producer.properties | grep -vE '^#'
bootstrap.servers=KAFKA-CLUSTER-2:9092
client.id=kloak-mirrormaker-test
producer.type=async
compression.type=none
serializer.class=kafka.serializer.DefaultEncoder
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
We would like to aggregate data across multiple Kafka clusters/topics. Can we read from two clusters (Cluster1 and Cluster2) and write to both clusters, into different topics? Or do you suggest reading from one cluster (Cluster1) and writing to two clusters (Cluster1 and Cluster2), into different topics?
Is this possible? If yes, how?
What is the best practice for this scenario?
Issue Analytics
- State:
- Created 5 years ago
- Comments:15 (7 by maintainers)
Top GitHub Comments
Sorry for the confusion. Let's see if this is clearer: one urep cluster can only handle one src and one dst cluster. In your case, since you only have two clusters, you won't get much benefit from the Federated-uReplicator branch. In the master branch, you need to set up two urep clusters yourself. In the Federated-uReplicator branch, you still need to set up two federation clusters (one in each region, because you want to run workers in each region); the federation layer will then automatically create one urep cluster in each region for you when you whitelist the topic.
However, in the following case the Federated-uReplicator branch gives you much more benefit: you have cluster1 and cluster2 in US East and cluster3 and cluster4 in US West. In the master branch, you need to manually create 12 urep clusters yourself (one per src->dst pair, 4 x 3 = 12). In the Federated-uReplicator branch, you only need to set up 4 federation clusters (one in each cluster); all 12 of the clusters you would create by hand in the master branch are created automatically for you.
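To make the count of 12 concrete, here is a minimal sketch that lists every directed src->dst pair among four clusters. The function name `list_pipelines` is hypothetical and the cluster names are just the placeholders from the comment above; nothing here calls uReplicator itself.

```shell
#!/bin/bash
# Print every directed src->dst pair among the clusters passed as arguments.
# In the master branch, each printed pair would need its own urep cluster.
list_pipelines() {
  local src dst
  for src in "$@"; do
    for dst in "$@"; do
      # A cluster is never replicated onto itself, so skip src == dst.
      if [ "$src" != "$dst" ]; then
        echo "${src}->${dst}"
      fi
    done
  done
}

# Placeholder names: cluster1,2 in US East and cluster3,4 in US West.
list_pipelines cluster1 cluster2 cluster3 cluster4
```

With four clusters this prints 12 lines; with only two clusters it prints 2, which is why the federation layer saves little in the two-cluster case.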
Federated-uReplicator has a federation layer on top of the master branch. It automatically sets up the cluster1->cluster2 and cluster2->cluster1 pipelines for you. However, each pipeline can only have one src and one dst cluster. E.g., in the Federated-uReplicator branch, you need to set up one manager cluster, one controller cluster, and one worker cluster, and both the cluster1->cluster2 and cluster2->cluster1 replication will be handled by the manager. In the master branch, you need to set up two controller clusters and two worker clusters: one pair handles cluster1->cluster2 and the other handles cluster2->cluster1.
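For the master-branch setup described above, a rough sketch of the two controller invocations might look like the following. It reuses only the flags that appear in the start script from the question. The helper name `controller_args`, the Helix cluster names, the ZooKeeper paths, and the ports are all assumptions for illustration. The script only prints the commands; it does not start anything.

```shell
#!/bin/bash
# Build controller arguments for ONE direction (one src, one dst).
# Arguments: helix cluster name, src ZK path, dst ZK path, Helix ZK, port.
controller_args() {
  local name="$1" src_zk="$2" dst_zk="$3" helix_zk="$4" port="$5"
  echo "-helixClusterName ${name} -srcKafkaZkPath ${src_zk} -destKafkaZkPath ${dst_zk} -zookeeper ${helix_zk} -port ${port} -mode auto"
}

# Assumed placeholder ZooKeeper paths for the two Kafka clusters.
c1_zk='ZK-CLUSTER-1:2181/cluster1'
c2_zk='ZK-CLUSTER-2:2181/cluster2'

# Dry run: one controller per direction, each with its own Helix cluster
# name and port (both hypothetical values).
echo "./start-controller.sh startMirrorMakerController $(controller_args urep-c1-to-c2 "$c1_zk" "$c2_zk" 'ZK-CLUSTER-1:2181' 10000)"
echo "./start-controller.sh startMirrorMakerController $(controller_args urep-c2-to-c1 "$c2_zk" "$c1_zk" 'ZK-CLUSTER-2:2181' 10001)"
```

Each direction would presumably also need its own worker configuration, with `zookeeper.connect` in consumer.properties pointing at that direction's source and `bootstrap.servers` in producer.properties pointing at its destination, mirroring the worker configs in the question.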