Skip to content

Leader liveness and rotation

HagarMeir edited this page May 26, 2019 · 4 revisions

A Byzantine tolerant consensus algorithm must ensure the liveness of the state machine replication operation in an event where the leader has crashed, is unreachable, or is malicious, for example, censoring client requests.

Malicious leaders may not include client requests in their proposals, or may try to suspend progress by not sending proposals to nodes. To that end, whenever a valid request arrives to a follower - it starts a timer and waits for the request to be included in some proposal. If the timer times out, it sends the request to the leader and resets the timer. Upon a second timeout - the follower suspects the leader, and decides to begin a view change. The use of two timeouts is as suggested by BFT-SMaRt paper. In the first timeout we suspect the client not submitting the request to the leader and only in the second timeout we suspect the leader.

However, even if the leader is honest it might crash or simply be disconnected from the rest of the nodes, and therefore cannot broadcast new proposals. If the leader crashes or its communication with the cluster is severed at a time where there is no application traffic, it is imperative to detect this as fast as possible and elect a new leader to prevent availability from being impaired. Hence, the leader is expected to periodically send heartbeat messages to the followers. If a follower hasn't received a heartbeat from the leader within a timely manner, it suspects the leader has crashed or is unavailable and resolves to start a view change.

During the view change, followers broadcast votes to move to a new view, with a new leader.

Let the ID of the current leader be denoted i. the next leader to be elected j is the lowest j such that j>i , or if no such ID exists, then the lowest ID (wrap-around the IDs).

Clone this wiki locally