How choosing the Raft consensus algorithm saved us 3 months of development time
- Track: Lightning Talks
- Room: H.2215 (Ferrer)
- Day: Sunday
- Start: 11:20
- End: 11:35
Providing full data availability in a degraded cluster requires a mechanism for coordinating changes to cluster membership in an automated way.
While developing our GPL-licensed distributed storage solution, we surveyed state-of-the-art consensus algorithms for achieving fault tolerance. Our first stop was Paxos, which turned out to be complex and to lack an accurate, comprehensive specification. Without a reference implementation, confusion reigns supreme, and analyzing all the different variants would have taken us at least 3 months. By contrast, Raft defines a single clean way to implement reliable leader election. It keeps the cluster responsive by letting each node decide autonomously on its participation in the cluster, while preserving data availability.
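To give a flavor of what leader election in Raft looks like, here is a minimal sketch of the vote-granting rule as described in the Raft paper. It is not the implementation discussed in the talk: the names `Node`, `handle_request_vote` and `wins_election` are hypothetical, and networking, timeouts and persistence are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical in-memory Raft node; only the voting state is modeled."""
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None
    # Each log entry is (term, command); Raft log indices are 1-based.
    log: List[Tuple[int, str]] = field(default_factory=list)

    def last_log(self) -> Tuple[int, int]:
        """Return (last_log_term, last_log_index), or (0, 0) for an empty log."""
        if not self.log:
            return (0, 0)
        return (self.log[-1][0], len(self.log))

    def handle_request_vote(self, term: int, candidate_id: str,
                            last_log_index: int,
                            last_log_term: int) -> Tuple[int, bool]:
        """Return (current_term, vote_granted) for a RequestVote call."""
        # Reject candidates from an older term.
        if term < self.current_term:
            return (self.current_term, False)
        # A newer term resets our vote for that term.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # The candidate's log must be at least as up to date as ours:
        # compare the last term first, then the log length.
        up_to_date = (last_log_term, last_log_index) >= self.last_log()
        # Grant at most one vote per term.
        if self.voted_for in (None, candidate_id) and up_to_date:
            self.voted_for = candidate_id
            return (self.current_term, True)
        return (self.current_term, False)


def wins_election(votes_granted: int, cluster_size: int) -> bool:
    """A candidate becomes leader with votes from a strict majority."""
    return votes_granted > cluster_size // 2
```

Because each node grants at most one vote per term and a strict majority is required, two candidates cannot both win the same term.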
In this lightning talk I will recount the road bumps we hit while implementing the consensus algorithm and the headaches we had before we ditched Paxos. With this head start on Raft, I hope to save my fellow developers from the pain we went through!
Speakers
Robert Wojciechowski