short lived leader election reported when starting a raft cluster

The issue can be reproduced with the example project, by building it locally and adding a cluster.bat to the output folder that has the following lines:

START /B .\RaftNode.exe tcp 3262 node2 > node2.log
START /B .\RaftNode.exe tcp 3263 node3 > node3.log
START /B .\RaftNode.exe tcp 3264 node4 > node4.log

Run: del -r node* && .\cluster.bat

I can reproduce it within 2-6 attempts on a Windows x64 machine. The issue was originally found on a Raspberry Pi (ARM, Linux).

The leader prints this to its log:

New cluster leader is elected. Leader address is 127.0.0.1:3260
Term of local cluster member is 1. Election timeout 00:00:00.1590000
Consensus cannot be reached
Term of local cluster member is 1. Election timeout 00:00:00.1590000
New cluster leader is elected. Leader address is 127.0.0.1:3267
Term of local cluster member is 2. Election timeout 00:00:00.1590000
Accepting value 500
Accepting value 1000
Accepting value 1500
Accepting value 2000
...

The other nodes print the following (the leader additionally prints the save messages):

New cluster leader is elected. Leader address is 127.0.0.1:3267
Term of local cluster member is 2. Election timeout 00:00:00.1700000
Accepting value 500
Accepting value 1000
Accepting value 1500
Accepting value 2000
...
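
Since the problem only shows up once every few attempts, it can help to scan the node logs automatically instead of reading them by eye. The following is a minimal sketch, assuming the log wording matches the excerpts above exactly; `find_short_lived_leaders` is a hypothetical helper, not part of dotNext or the example project.

```python
import re

# Patterns copied from the log excerpts in this issue; other versions of the
# example may word these messages differently.
LEADER_RE = re.compile(r"New cluster leader is elected\. Leader address is (\S+)")
TERM_RE = re.compile(r"Term of local cluster member is (\d+)\.")
NO_CONSENSUS = "Consensus cannot be reached"


def find_short_lived_leaders(log_text):
    """Return (address, term) for every leader that was announced and then
    lost before any value was accepted (hypothetical helper)."""
    short_lived = []
    leader = None  # address from the most recent leader announcement
    term = None    # term reported right after that announcement
    for line in log_text.splitlines():
        match = LEADER_RE.search(line)
        if match:
            leader, term = match.group(1), None
            continue
        match = TERM_RE.search(line)
        if match and leader is not None and term is None:
            term = int(match.group(1))
            continue
        if NO_CONSENSUS in line and leader is not None:
            short_lived.append((leader, term))
            leader = None
        elif line.startswith("Accepting value"):
            leader = None  # this leader is replicating values, so it is stable
    return short_lived


sample = """New cluster leader is elected. Leader address is 127.0.0.1:3260
Term of local cluster member is 1. Election timeout 00:00:00.1590000
Consensus cannot be reached
Term of local cluster member is 1. Election timeout 00:00:00.1590000
New cluster leader is elected. Leader address is 127.0.0.1:3267
Term of local cluster member is 2. Election timeout 00:00:00.1590000
Accepting value 500
"""
print(find_short_lived_leaders(sample))  # [('127.0.0.1:3260', 1)]
```

A log like the second excerpt, where the first announced leader immediately starts accepting values, yields an empty list.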

When done with the run, the RaftNode processes need to be killed via Task Manager, since they keep running in the background.
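
One way to avoid the manual cleanup is to start the nodes from a small script that also stops them. This is only a sketch under assumptions: the ports and node names are copied from the cluster.bat lines above, `build_command` and `run_cluster` are hypothetical helpers, and the script does not remove state left over from earlier runs, so the `del -r node*` step is still needed beforehand.

```python
import subprocess
import time

# Ports and node names copied from the cluster.bat lines in this issue.
NODES = [(3262, "node2"), (3263, "node3"), (3264, "node4")]


def build_command(exe, port, name):
    """Command line for one node, mirroring the RaftNode.exe lines in cluster.bat."""
    return [exe, "tcp", str(port), name]


def run_cluster(exe=r".\RaftNode.exe", seconds=10.0):
    """Start every node, let the cluster run, then stop all node processes."""
    procs, logs = [], []
    try:
        for port, name in NODES:
            log = open(f"{name}.log", "w")  # same log file names as cluster.bat
            logs.append(log)
            procs.append(subprocess.Popen(build_command(exe, port, name), stdout=log))
        time.sleep(seconds)
    finally:
        # Runs even if a node fails to start, so no process is left behind.
        for proc in procs:
            proc.terminate()
            proc.wait()
        for log in logs:
            log.close()
```

Calling `run_cluster(seconds=5)` from the output folder would run one attempt and leave node2.log, node3.log and node4.log behind for inspection.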

_Originally posted by @freddyrios in https://github.com/dotnet/dotNext/discussions/167#discussioncomment-6062222_

Issue Analytics

  • State: closed
  • Created 4 months ago
  • Comments: 24 (11 by maintainers)

Top GitHub Comments

freddyrios commented, Jun 7, 2023

works great, thanks!

Ran the example reproduction at least 15 times, and in most cases it properly elects the leader in term 1. In the remaining cases, all nodes became candidates at nearly the same time, rejected each other's votes, and then elected a leader in term 2 (as expected in Raft).
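
The term 2 case described here is the usual Raft split vote. A rough model of it, assuming a hypothetical three-node cluster: each node grants at most one vote per term, and a candidate votes for itself. The `elect` function is illustrative only; it ignores log comparison and message timing, and it is not how dotNext implements elections.

```python
def elect(candidates, cluster_size):
    """One simplified Raft election round (illustrative model only).

    candidates: nodes whose election timeout fired, in firing order.
    Returns the winner, or None when no candidate reaches a majority.
    """
    voted_for = {c: c for c in candidates}  # every candidate votes for itself
    for candidate in candidates:
        for node in range(cluster_size):
            # A node that has not voted in this term grants the first request.
            voted_for.setdefault(node, candidate)
    majority = cluster_size // 2 + 1
    for candidate in candidates:
        votes = sum(1 for choice in voted_for.values() if choice == candidate)
        if votes >= majority:
            return candidate
    return None


# Term 1: all three nodes time out together, so each only has its own vote.
print(elect([0, 1, 2], cluster_size=3))  # None
# Term 2: only node 1 times out first, so the others grant it their votes.
print(elect([1], cluster_size=3))  # 1
```

In this model, no leader is ever announced in the split term, which matches the behavior reported after the fix.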

freddyrios commented, Jun 6, 2023

FYI I pulled the latest develop after my last message and reproduced it again.
