Byzantine Fault Tolerance #105

Open
traverseda opened this issue Feb 7, 2019 · 0 comments

Byzantine fault tolerance would allow you to create clusters with untrusted nodes. This means that you could open a service up to the public and anyone could participate. This is an important feature for things like distributed chat rooms or cryptocurrencies.

This might be significantly out of scope for PySyncObj, but it would be a very nice feature to have.

So let's go over some of the relevant features of BFT Raft from that research paper (a rough Python sketch of how items 1, 3, and 5 might fit together follows the list):

  1. Message signatures: BFT Raft uses digital signatures extensively to authenticate messages and verify their integrity. For example, the leader replicates client messages along with the client signatures. This prevents a Byzantine leader from modifying the message contents or forging messages. Client public keys are kept separate from replica public keys to enforce that only clients can send new valid commands, and only replicas can send valid Raft RPCs.

  2. Client intervention: BFT Raft allows clients to interrupt the current leadership if it fails to make progress. This allows BFT Raft to prevent Byzantine leaders from starving the system.

  3. Incremental hashing: Each replica in BFT Raft computes a cryptographic hash every time it appends a new entry to its log. The hash is computed over the previous hash and the newly appended log entry. A node can sign its last hash to prove that it has replicated the entirety of a log, and other servers can verify this quickly using the signature and the hash.

  4. Election verification: Once a node has become leader, its first AppendEntries RPC to each other node will contain a quorum of RequestVoteResponse RPCs that it received in order to become leader (this is sent before the leader accepts any new entries from clients). Nodes first verify that the leader has actually won an election by counting and validating each of these RequestVoteResponses. Subsequent AppendEntries RPCs in the same term to the same node need not include these votes, but a node can ask for them in its AppendEntriesResponse if necessary. This will happen if the replica restarted and no longer believes that the current leader won the election for that term.

  5. Commit verification: In order to safely commit entries as they are replicated, each AppendEntriesResponse RPC is broadcast to each other node, rather than just to the leader. Further, each node decides for itself when to increment its commit index, rather than the leader. It does this by keeping track of the AppendEntriesResponse RPCs received from the other nodes, which are signed and which include the incremental hash of the last entry in that node's log. Once a node has received a quorum of matching AppendEntriesResponse RPCs from other nodes at a particular index, it considers that to be the new commit index, and discards any stored AppendEntriesResponse RPCs for previous indices. This differs from Raft in that the leader no longer has any special responsibility of coordinating commits. AppendEntriesResponse RPCs become closer to the broadcast PREPARE message from PBFT. Each node can verify for itself that a quorum of nodes have prepared up to a particular index and have matching entries for that index and all previous indices by checking the incremental hash.

  6. Lazy voters: A node does not grant a vote to a candidate unless it believes the current leader is faulty. A node comes to believe that its leader is faulty if it does not receive an AppendEntries RPC within its own election timeout, or it receives an UpdateLeader RPC from a client for that leader. This prevents nodes that start unnecessary elections from gaining the requisite votes to become leader and starve the system.
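
To make items 1, 3, and 5 a little more concrete, here is a rough Python sketch of how the signed responses, the incremental hash chain, and the per-node quorum commit tracking could fit together. This is not PySyncObj code: `BFTLog`, `CommitTracker`, and the message layout are made up for illustration, and it assumes the third-party `cryptography` package for Ed25519 signatures.

```python
import hashlib
from collections import defaultdict

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


class BFTLog:
    """Toy log keeping the incremental hash chain from item 3."""

    def __init__(self, signing_key: Ed25519PrivateKey):
        self._key = signing_key
        self.entries = []            # list of (entry_bytes, chained_hash)
        self.last_hash = b"\x00" * 32

    def append(self, entry: bytes) -> bytes:
        # hash(previous_hash || new_entry) -- the incremental hash
        self.last_hash = hashlib.sha256(self.last_hash + entry).digest()
        self.entries.append((entry, self.last_hash))
        return self.last_hash

    def signed_response(self) -> dict:
        # A node signs its latest chained hash to prove it holds the whole
        # prefix (item 3); this plays the role of a signed
        # AppendEntriesResponse broadcast to every node (item 5).
        return {
            "index": len(self.entries),
            "hash": self.last_hash.hex(),
            "signature": self._key.sign(self.last_hash).hex(),
        }


def verify_response(resp: dict, sender_pub: Ed25519PublicKey) -> bool:
    """Item 1: reject any response whose signature does not verify."""
    try:
        sender_pub.verify(bytes.fromhex(resp["signature"]),
                          bytes.fromhex(resp["hash"]))
        return True
    except InvalidSignature:
        return False


class CommitTracker:
    """Item 5: a node advances its own commit index once it has seen a
    quorum of matching, signed responses for some index."""

    def __init__(self, quorum: int):
        self.quorum = quorum
        self.commit_index = 0
        # (index, hash) -> set of node ids that sent a matching response
        self._votes = defaultdict(set)

    def record(self, node_id: str, resp: dict,
               sender_pub: Ed25519PublicKey) -> int:
        if not verify_response(resp, sender_pub):
            return self.commit_index
        key = (resp["index"], resp["hash"])
        self._votes[key].add(node_id)
        if resp["index"] > self.commit_index and len(self._votes[key]) >= self.quorum:
            self.commit_index = resp["index"]
            # discard bookkeeping for indices at or below the new commit index
            self._votes = defaultdict(
                set,
                {k: v for k, v in self._votes.items() if k[0] > self.commit_index},
            )
        return self.commit_index
```

The main point is the last class: the leader has no special role in committing, since every node advances its own commit index from the signed responses it hears from everyone else.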

Outside of all that, the serializer would have to be something other than pickle, since unpickling data from untrusted nodes allows arbitrary code execution.
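
For example, anything that only round-trips plain data would do. A minimal sketch of a length-prefixed JSON wire format (the function names here are made up, not part of PySyncObj):

```python
import json
import struct


def dumps(obj) -> bytes:
    # Only plain JSON types cross the wire; nothing executable.
    payload = json.dumps(obj, separators=(",", ":"), sort_keys=True).encode()
    return struct.pack(">I", len(payload)) + payload


def loads(data: bytes):
    (length,) = struct.unpack(">I", data[:4])
    return json.loads(data[4:4 + length].decode())
```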
