Two implementations of the Raft consensus algorithm, using the Scala language and:
- a functional effect system (zio)
- Project Loom
Currently, the goal of this project is educational, not production usage.
If you're new to Scala, you might want to set up your environment first, installing Java and an IDE. This page might be helpful. To run the Loom version, you'll need Java 19 (currently available as a preview version). You can install and manage multiple Java versions e.g. using SDKMAN.
Saft's architecture and evaluation are covered in the "Implementing Raft using a functional effect system" article.
To run the ZIO variant of the provided in-memory Raft simulation, using 5 nodes, use the following:
sbt "zio/runMain saft.SaftSim"
You'll need to have sbt installed.
To run the Loom variant, use the following (note: Java 19+ is needed):
sbt "loom/runMain saft.SaftSim"
If you'd like to run the Loom variant e.g. from an IDE, make sure the following is included in the VM options:
--enable-preview --add-modules jdk.incubator.concurrent
The `loom` and `zio` modules contain a number of files which implement the Raft algorithm, along with an in-memory runnable simulation, an HTTP+JSON-based app and some tests.
Before getting to know the implementation, it's best to at least skim the Raft paper. There's a one-page summary of the algorithm on page 4.
Reading the files in the following order might be easiest (links mostly go to the Loom version; the structure of the ZIO one is the same):
- start with `domain.scala`, which defines some basic data types. We've got a representation of a `NodeId`, which simply wraps a number; these ids are then used to identify nodes and to communicate between them. There are also some opaque types (newtypes), which at runtime are erased to their base representation, but at compile time are distinct (here we're creating artificial subtypes of `Int` or `String`). These include `Term`, which counts the Raft terms, as well as types for log indexes and log data. Finally, there are data structures which represent the content of the logs, such as `LogEntry`, which combines data in the log with the term in which the entry was added (as described in the Raft paper). A sketch of how such opaque types can be encoded follows this list.
- these basic types are used to create representations of server state in `state.scala`. This file contains classes such as `ServerState`, `LeaderState` etc., which correspond directly to the state that each server should store, as described in the Raft paper. If there's anything extra, it is clearly commented, with a justification. The state classes contain methods which either update the state with new data (such as appending a new entry), check conditions (such as whether a log entry can be applied given the current state), or compute views of the data (such as the commit index). These functions are always pure, that is side-effect-free, and in the case of updates, return a new immutable representation of the state (see the sketch after this list).
- then, take a look at `messages.scala`. There, you can find the messages defined by the Raft protocol, such as `RequestVote` or `AppendEntries`, along with classes representing responses to these requests. The messages are classified based on whether a client or a server is the sender/recipient, and on whether the message is a request or a response. These classification traits (`ToServerMessage`, `ResponseMessage` etc.) are then used to ensure that a message can only be used in the appropriate context (a sketch of this encoding follows the list).
- the implementation is event-driven, that is, the appropriate logic is run in response to events being placed on a queue. The events are defined in `ServerEvent`, and the queues are abstracted using the `Comms` interface. The events are pretty straightforward: a node can either receive a timeout (for example an election timeout, if there was no communication from the leader for the specified amount of time), or a request/response (see the sketch below).
- finally, with these prerequisites, it should be enough to study the `Node` implementation itself. That's where the main logic of the Raft algorithm is implemented. The code is commented with excerpts from the Raft "cheat sheet", the one-page summary on page 4 of the paper. The implementation of a node reads events from the queue in a loop, first matching on the event type (timeout / request received / response received), and then matching on the current role of the node (follower / candidate / leader). Each handler follows roughly the same steps: first the state is changed, yielding a new state instance (e.g. `ServerState` or `LeaderState`); then side effects are run, such as restarting the timer or sending out messages to other nodes (requesting votes or appending entries). After a handler completes, the state is persisted and the response, if any, is sent back. The overall shape of this loop is sketched after the list.
- to see the implementation in action, it's easiest to run the `SaftSim` app, which runs a number of nodes in-memory, using in-memory communication and in-memory persistence. Using the provided console interface, new entries can be added, and nodes can be interrupted and started again. Alternatively, you can start an HTTP+JSON-based version using `SaftHttp`; this will require starting the three nodes separately, though.
- there are three files which we haven't yet discussed, though they should be self-explanatory. `Conf` groups configuration values, such as the number of nodes and the timeouts. The `Persistence` interface is used to save the part of `ServerState` that should be persistent. Finally, `StateMachine` is where replicated, committed log entries are applied. A simple implementation is provided, which applies the given function in the background, so that applying a log entry doesn't block the node itself (a sketch of this idea follows the list).
- the ZIO test code might also be interesting, as it uses a manually managed clock, which allows running time-based tests in a fast and predictable way, eliminating possible flakiness (also sketched below).
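To illustrate the newtype style mentioned in the `domain.scala` bullet, here's a minimal, hypothetical sketch of opaque types in Scala 3. The exact definitions in saft may differ; only the names `NodeId`, `Term` and `LogEntry` come from the description above, the rest is invented for the example.

```scala
// A sketch only: opaque types are erased to Int/String at runtime,
// but are distinct, artificial subtypes at compile time.
object domain:
  opaque type Term = Int
  object Term:
    def apply(t: Int): Term = t
  extension (t: Term) def next: Term = t + 1

  opaque type LogIndex = Int
  object LogIndex:
    def apply(i: Int): LogIndex = i

  opaque type LogData = String
  object LogData:
    def apply(d: String): LogData = d

  // a node id simply wraps a number, identifying one of the servers
  case class NodeId(number: Int)

  // an entry combines log data with the term in which it was added
  case class LogEntry(data: LogData, term: Term)
```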
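As an illustration of the pure, immutable state-transition style described for `state.scala`, here's a hedged sketch; the field and method names are invented for the example, and the types are simplified (plain `Int`s) to keep it self-contained.

```scala
// Illustrative only: an immutable server state with pure update / check / view methods.
final case class LogEntry(data: String, term: Int)

final case class ServerState(
    log: Vector[LogEntry],
    currentTerm: Int,
    votedFor: Option[Int], // node id voted for in the current term
    commitIndex: Option[Int]
):
  // update: returns a new state instance instead of mutating in place
  def appendEntry(entry: LogEntry): ServerState = copy(log = log :+ entry)

  // check: a side-effect-free condition on the current state
  def hasEntryAtTerm(index: Int, term: Int): Boolean =
    log.lift(index).exists(_.term == term)

  // view: derived data, e.g. the index of the last log entry
  def lastIndex: Option[Int] = if log.isEmpty then None else Some(log.length - 1)
```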
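The classification of messages by sender/recipient and request/response can be expressed with marker traits, roughly as below. Only `ToServerMessage`, `ResponseMessage`, `RequestVote` and `AppendEntries` are names from the description above; `FromServerMessage`, `RequestMessage` and the fields are invented for the sketch.

```scala
// A sketch of the marker-trait encoding: each concrete message mixes in the
// traits describing where it can be sent and whether it's a request or a response.
sealed trait Message
sealed trait ToServerMessage extends Message   // can be sent to a server node
sealed trait FromServerMessage extends Message // sent by a server node
sealed trait RequestMessage extends Message
sealed trait ResponseMessage extends Message

final case class RequestVote(term: Int, candidateId: Int)
    extends ToServerMessage with FromServerMessage with RequestMessage

final case class RequestVoteResponse(term: Int, voteGranted: Boolean)
    extends ToServerMessage with FromServerMessage with ResponseMessage

// a method that only ever sends responses can then require the right type:
def reply(to: Int, msg: ResponseMessage): Unit =
  println(s"replying to node $to with $msg")
```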
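The event-driven core can be pictured as an event type plus a queue abstraction. The following sketch reuses the `ServerEvent` and `Comms` names from above, but the members and the in-memory implementation are assumptions made for illustration.

```scala
import java.util.concurrent.LinkedBlockingQueue

// A sketch of the event-driven core: events are either timeouts or incoming
// requests/responses, and Comms abstracts the per-node event queues.
enum ServerEvent:
  case Timeout
  case RequestReceived(message: String)  // simplified payload for the sketch
  case ResponseReceived(message: String)

trait Comms:
  // blocks until the next event for this node is available
  def next(): ServerEvent
  // enqueues an event to be processed by the given node
  def send(to: Int, event: ServerEvent): Unit

// a per-node view over shared in-memory queues, for illustration only
class InMemoryComms(self: Int, queues: Vector[LinkedBlockingQueue[ServerEvent]]) extends Comms:
  def next(): ServerEvent = queues(self).take()
  def send(to: Int, event: ServerEvent): Unit = queues(to).put(event)
```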
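The overall shape of the node loop (read an event, match on its type and on the current role, compute a new state, run side effects, persist) might look roughly like this. It's a structural sketch only, not saft's `Node` code: the types and the handler bodies are simplified stand-ins.

```scala
// A structural sketch of the node loop; not the actual implementation.
enum Role:
  case Follower, Candidate, Leader

enum Event:
  case Timeout
  case RequestReceived(data: String)
  case ResponseReceived(data: String)

final case class State(role: Role, term: Int)

class Node(initial: State, nextEvent: () => Event, persist: State => Unit):
  def run(): Unit =
    var state = initial
    while true do
      val event = nextEvent()
      // first match on the event type, then on the current role
      val newState = (event, state.role) match
        case (Event.Timeout, Role.Follower | Role.Candidate) =>
          // election timeout: become a candidate and start a new term;
          // the real code would also send RequestVote messages to peers here
          state.copy(role = Role.Candidate, term = state.term + 1)
        case (Event.Timeout, Role.Leader) =>
          // heartbeat timeout: the real code would send AppendEntries to followers
          state
        case (Event.RequestReceived(_), _) | (Event.ResponseReceived(_), _) =>
          // handle the message according to the current role (omitted in the sketch)
          state
      // after a handler completes, the (possibly changed) state is persisted
      persist(newState)
      state = newState
```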
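The "apply entries in the background" idea mentioned for `StateMachine` could be sketched as below; the trait shape, the `String` entry type and the use of a virtual thread are assumptions for the example (the Loom preview features the README already requires are enough to run it).

```scala
import java.util.concurrent.LinkedBlockingQueue

// Illustration only: committed entries are handed to a background thread,
// so applying an entry never blocks the node's event loop.
trait StateMachine:
  def apply(entry: String): Unit

class BackgroundStateMachine(applyFn: String => Unit) extends StateMachine:
  private val toApply = new LinkedBlockingQueue[String]()
  private val worker = Thread
    .ofVirtual() // a Loom virtual thread; a platform thread would work as well
    .start(() => while true do applyFn(toApply.take()))
  def apply(entry: String): Unit = toApply.put(entry)
```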
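Finally, as a sketch of the manually managed clock technique (not saft's actual tests), ZIO's `TestClock` lets a test advance time explicitly, so a sleep-based timeout completes without any real waiting:

```scala
import zio.*
import zio.test.*

// Illustration only: with the test clock, ZIO.sleep doesn't wait in real time;
// the test advances the clock explicitly, making timing-based tests fast and deterministic.
object ManualClockExampleSpec extends ZIOSpecDefault:
  def spec = test("a timeout fires once the clock is advanced") {
    for
      fiber  <- ZIO.sleep(2.seconds).as("election timeout").fork
      _      <- TestClock.adjust(2.seconds)
      result <- fiber.join
    yield assertTrue(result == "election timeout")
  }
```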
The following terms are used throughout the code and in the description above:

- `node`: one server running the Raft protocol
- `message`: sent between server nodes, or between a client and a server; includes the messages defined by the Raft protocol: `RequestVote`, `AppendEntries`, `NewEntry`
- `event`: processed by nodes sequentially, mediated by an event queue; events include outside communication (receiving a `message` as a request or response), or a timeout (e.g. an election timeout)
- `state`: the persistent and volatile data associated with each node and `role`, as defined in the Raft paper, along with some implementation-specific elements; includes, for example, the log itself (whose entries are being replicated), the commit index, or the known leader id
- `role`: follower, candidate or leader; determines the logic that a particular node runs in response to incoming `event`s
- `state machine`: where the replicated and committed log entries are applied; the ultimate destination of all log entries