(DotNext.Net.Cluster) Leader does not receive IRaftCluster.LeaderChanged event when downgrading to follower
Hello,
I am currently trying to run version 4.2.0-beta.1 and have come across an issue with writing log entries through Raft and the events associated with it. Steps to reproduce:
1. Have a single node start a standalone cluster (node 1)
2. Add a second member with AddMember (node 2)
3. Kill the process of node 2
4. Start the process of node 2 again
After step 3, my logs suggest that the leader steps down to follower but does not fire the IRaftCluster.LeaderChanged event. On node 1, IRaftCluster.Members still reports the local node 1 (Remote == false) as leader, but attempting to write to the log now results in:
System.InvalidOperationException: The local cluster member is not a leader
at DotNext.Threading.Tasks.ValueTaskCompletionSource`1.GetResult(Int16 token) in /_/src/DotNext.Threading/Threading/Tasks/ValueTaskCompletionSource.T.cs:line 272
at DotNext.Threading.Tasks.ValueTaskCompletionSource`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token) in /_/src/DotNext.Threading/Threading/Tasks/ValueTaskCompletionSource.T.cs:line 279
at DotNext.Net.Cluster.Consensus.Raft.LeaderState.ReplicationCallback.Invoke() in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/LeaderState.Replication.cs:line 177
--- End of stack trace from previous location ---
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.ReplicateAsync[TEntry](TEntry entry, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/RaftCluster.cs:line 818
IClusterMember.MemberStatusChanged will correctly fire and set node 2 to Unavailable during step 3.
This behavior was not present before the upgrade to 4.2.0-beta.1.
After step 4 is done, IClusterMember.MemberStatusChanged will correctly fire on node 1 and mark node 2 as available again. However, node 1 is still unable to write to the log, as it seemingly is no longer the leader. Only some time after this step will node 1 correctly fire IRaftCluster.LeaderChanged with no leader, and then fire it again with a leader.
I am using HTTP transport.
Closing this issue due to lack of activity.