question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Urgent - Worker restart , always have a number of duplicate message in topic

See original GitHub issue

Hi everyone , I’m implementing uReplicator with latest master branch ( commit 70ac3404bb4517756c6813d3962b77edc7700019 ). As the another commit on master , after Worker restart, always have lots of message duplicate in Destination topic .

Here is overview my step :

step1: Build the latest code

step2: build Docker image with this Dockerfile

FROM openjdk:8-jre

COPY confd-0.15.0-linux-amd64 /usr/local/bin/confd
COPY uReplicator/uReplicator-Distribution/target/uReplicator-Distribution-pkg /uReplicator
COPY uReplicator/config uReplicator/config
COPY confd-new /etc/confd
#I use confd to generate configuration file in Container 
COPY entrypoint-new.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh && \
    chmod +x /usr/local/bin/confd && \
    chmod +x /uReplicator/bin/*.sh && 

ENTRYPOINT [ "/entrypoint.sh" ]

Step3: prepare 2 topic:

  • Source topic : 500 message & keep rising to target 1000 message.
  • Destination topic : Blank topic - 0 message

step4: apply deployment on Kubernetes with config :

  • Controller : 1 pod : resource require: 100m CPU - 1500Mi memory
  • Worker : 2 pod : resource require: 100m CPU - 1500Mi memory

clusters.properties

kafka.cluster.zkStr.cluster1=src-zk:2181
kafka.cluster.servers.cluster1=src-kafka:9092
kafka.cluster.zkStr.cluster2=dst-zk:2181
kafka.cluster.servers.cluster2=dst-kafka:9092

consumer.properties

zookeeper.connect=src-zk:2181
bootstrap.servers=src-kafka:9092
zookeeper.connection.timeout.ms=30000
zookeeper.session.timeout.ms=30000
group.id=kloak-mirrormaker-test
consumer.id=kloakmms01-sjc1
socket.receive.buffer.bytes=1048576
fetch.message.max.bytes=8388608
queued.max.message.chunks=5
key.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
value.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
auto.offset.reset=smallest

dstzk.properties

enable=true
zkServer=src-zk:2181

helix.properties : disable instanceId because of i’m using more than 1 Worker

zkServer=src-zk:2181
#instanceId=testHelixMirrorMaker01
helixClusterName=testMirrorMaker
federated.deployment.name=uReplicator-Tests

producer.properties

bootstrap.servers=dst-kafka:9092
client.id=kloak-mirrormaker-test

Step4: check number of message on both server

Src & Dst topic : 600 message

Step5 : Delete 1 worker POD kubernetes by : kubectl delete pod … ( no --force command )

result : 1 - old POD is Terminating. Src & Dst keep sync up on time : 750 -750 on both topic

2 - a few second later: new POD is UP Src topic : keep rising to 900 -> 95- -> 1000 Dst topic : stay at 750 message

3 - a minutes later : Src topic : 1000 message Dst topic : more than 1000 message : sometime 1200 message , sometime 1100-1300…etc ( the result is different in every test )

Note : if no Worker is down. The replication result is perfect . But in this case in everytime is test , the result is different and uReplicator is not stable. @yangy0000 @Pawka @DAddYE @hparra @chirayuk @dentafrice CC to anyone Who contributed to this application . Please let me know/correct me if something went wrong .

Issue Analytics

  • State:open
  • Created 4 years ago
  • Comments:21 (10 by maintainers)

github_iconTop GitHub Comments

2reactions
Technoboy-commented, Oct 29, 2019

yes ,perfect .

2reactions
dungnt081191commented, Nov 13, 2019

@Technoboy- hey bro , the problem is : kill process or kill -9 process . I did this way with kill -9

I’ve tried with ‘kill processId’ and here is result : have commit offset log? And It’s Work !!!

2019-10-29T07:45:23.181+0000: Total time for which application threads were stopped: 0.0002563 seconds, Stopping threads took: 0.0000400 seconds
2019-10-29T07:45:40.183+0000: Total time for which application threads were stopped: 0.0001666 seconds, Stopping threads took: 0.0000360 seconds
[2019-10-29 07:45:45,083] INFO Start clean shutdown. (kafka.mirrormaker.WorkerInstance:66)
[2019-10-29 07:45:45,086] INFO Shutting down consumer thread. (kafka.mirrormaker.WorkerInstance:66)
[2019-10-29 07:45:45,089] INFO [mirrormaker-thread-HelixMirrorMaker-1572335099430] mirrormaker-thread-HelixMirrorMaker-1572335099430 shutting down (kafka.mirrormaker.WorkerInstance$MirrorMakerThread:66)
2019-10-29T07:45:45.184+0000: Total time for which application threads were stopped: 0.0001369 seconds, Stopping threads took: 0.0000343 seconds
2019-10-29T07:45:51.185+0000: Total time for which application threads were stopped: 0.0001361 seconds, Stopping threads took: 0.0000360 seconds
[2019-10-29 07:45:51,811] INFO [mirrormaker-thread-HelixMirrorMaker-1572335099430] Mirror maker thread stopped (kafka.mirrormaker.WorkerInstance$MirrorMakerThread:66)
[2019-10-29 07:45:51,811] INFO [mirrormaker-thread-HelixMirrorMaker-1572335099430] Mirror maker thread shutdown complete (kafka.mirrormaker.WorkerInstance$MirrorMakerThread:66)
[2019-10-29 07:45:51,811] INFO Flushing last batches and commit offsets. (kafka.mirrormaker.WorkerInstance:66)
[2019-10-29 07:45:51,812] INFO Flushing producer. (kafka.mirrormaker.WorkerInstance:66)
[2019-10-29 07:45:51,816] INFO Committing offsets. (kafka.mirrormaker.WorkerInstance:66)
[2019-10-29 07:45:51,823] INFO Shutting down consumer connectors. (kafka.mirrormaker.WorkerInstance:66)
[2019-10-29 07:45:51,823] INFO Connector is now shutting down! (kafka.mirrormaker.KafkaConnector:66)
[2019-10-29 07:45:51,838] INFO [CompactConsumerFetcherManager-1572335099655] Stopping leader finder thread (kafka.mirrormaker.CompactConsumerFetcherManager:66)
[2019-10-29 07:45:51,839] INFO [kloak-mirrormaker-dc-test2_kloakmms01-dc-leader-finder-thread]: Shutting down (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2019-10-29 07:45:51,839] INFO [kloak-mirrormaker-dc-test2_kloakmms01-dc-leader-finder-thread]: Stopped (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2019-10-29 07:45:51,839] INFO [kloak-mirrormaker-dc-test2_kloakmms01-dc-leader-finder-thread]: Shutdown completed (kafka.mirrormaker.CompactConsumerFetcherManager$LeaderFinderThread:66)
[2019-10-29 07:45:51,840] INFO [CompactConsumerFetcherManager-1572335099655] Stopping all fetchers (kafka.mirrormaker.CompactConsumerFetcherManager:66)
[2019-10-29 07:45:51,844] INFO [CompactConsumerFetcherManager-1572335099655] All connections stopped (kafka.mirrormaker.CompactConsumerFetcherManager:66)
Read more comments on GitHub >

github_iconTop Results From Across the Web

Effective strategy to avoid duplicate messages in apache ...
The short answer is, no. What you're looking for is exactly-once processing. While it may often seem feasible, it should never be relied ......
Read more >
How to Handle Duplicate Messages and Message Ordering in ...
There're a couple of ways to handle duplicate messages: write idempotent message handler.
Read more >
NAVTEX On Ships: Working, Types Of Messages And ...
NAVTEX is a device used on-board the vessels to provide short range Maritime Safety Information in coastal waters automatically.
Read more >
RFC 793: Transmission Control Protocol
There have been many contributors to this work both in terms of concepts and in ... Control Protocol Philosophy currently reading there is...
Read more >
Change notification settings on iPhone - Apple Support (IN)
You can turn app notifications on or off, have notifications play a sound, ... For many apps, you can also set a notification...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found