A practical implementation of a distributed key-value store and matching engine built upon OpenRaft. This project demonstrates how to implement persistent storage for Raft logs and snapshots on disk using Sled as the underlying storage engine.
Note: For a more detailed guide in Chinese, please refer to GUIDE_cn.md.
- Features
- Architecture Overview
- Getting Started
- Running the Cluster
- Project Structure
- Storage Implementation
- Cluster Management
- API Reference
- Persistent Raft Storage: Raft logs and state are persisted using Sled, a high-performance embedded key-value store
- Snapshot Management: State machine snapshots are stored as files on disk for efficient recovery
- Matching Engine: Includes a simple order book matching engine as an example application
- HTTP Server: Built with Actix-web providing both internal Raft APIs and external application APIs
- Smart Client: A Rust client that automatically tracks and redirects requests to the current leader
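The leader-tracking behaviour of the smart client can be sketched as a small retry loop. This is only an illustration under stated assumptions: `Reply`, `LeaderTrackingClient`, and the `transport` callback are hypothetical names, the transport is a plain closure rather than a real HTTP call, and the project's actual `ExampleClient` may be structured differently.

```rust
/// Hypothetical reply type: either the request was handled, or the node
/// points the caller at the node it believes is the leader.
#[derive(Debug, PartialEq)]
pub enum Reply {
    Ok(String),
    ForwardToLeader(Option<u64>),
}

/// Hypothetical client that remembers the last known leader id.
pub struct LeaderTrackingClient {
    pub leader: u64,
    pub max_retries: usize,
}

impl LeaderTrackingClient {
    /// Send `request` to the remembered leader. On a redirect, update the
    /// remembered leader and try again, up to `max_retries` extra attempts.
    pub fn send<F>(&mut self, request: &str, mut transport: F) -> Result<String, String>
    where
        F: FnMut(u64, &str) -> Reply,
    {
        for _ in 0..=self.max_retries {
            match transport(self.leader, request) {
                Reply::Ok(body) => return Ok(body),
                Reply::ForwardToLeader(Some(id)) => self.leader = id,
                Reply::ForwardToLeader(None) => return Err("no leader known".to_string()),
            }
        }
        Err("too many redirects".to_string())
    }
}
```

Because the client keeps the updated leader id, later requests go straight to the leader until leadership changes again.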
This project extends the example-raft-kv with persistent storage capabilities:
                 ┌─────────────────────────────────────────┐
                 │            Application Layer            │
                 │ (KV Store + Order Book Matching Engine) │
                 └────────────────────┬────────────────────┘
                                      │
                 ┌────────────────────▼────────────────────┐
                 │              OpenRaft Core              │
                 │  (Consensus, Replication, Leadership)   │
                 └────────────────────┬────────────────────┘
                                      │
          ┌───────────────────────────┼───────────────────────────┐
          │                           │                           │
┌─────────▼─────────┐       ┌─────────▼─────────┐       ┌─────────▼─────────┐
│    RaftStorage    │       │    RaftNetwork    │       │    HTTP Server    │
│  (Sled + Files)   │       │ (Reqwest Client)  │       │    (Actix-web)    │
└───────────────────┘       └───────────────────┘       └───────────────────┘
- RaftStorage Implementation (store/)
  - Raft logs and votes stored in Sled
  - State machine in memory with file-based snapshots
  - Automatic snapshot creation every 500 log entries
- Network Layer (network/)
  - Internal Raft RPC for replication and voting
  - Connection pooling for efficient node-to-node communication
  - Management APIs for cluster administration
- Matching Engine (matchengine/)
  - Order book with bids and asks
  - Example of a real-world state machine application
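To make the "order book with bids and asks" idea concrete, here is a minimal price-time priority matcher. It is a sketch, not the code in matchengine/: the `Order`, `Trade`, and `OrderBook` types, integer prices, and limit-order-only behaviour are all assumptions made for illustration.

```rust
use std::collections::{BTreeMap, VecDeque};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Side {
    Buy,
    Sell,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Order {
    pub id: u64,
    pub side: Side,
    pub price: u64,
    pub qty: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Trade {
    pub maker_id: u64,
    pub taker_id: u64,
    pub price: u64,
    pub qty: u64,
}

/// Price levels are sorted by the BTreeMap; each level is a FIFO queue.
#[derive(Default)]
pub struct OrderBook {
    bids: BTreeMap<u64, VecDeque<Order>>, // best bid = highest price
    asks: BTreeMap<u64, VecDeque<Order>>, // best ask = lowest price
}

impl OrderBook {
    /// Match an incoming limit order against the opposite side, then rest
    /// any unfilled remainder on the book.
    pub fn place(&mut self, mut taker: Order) -> Vec<Trade> {
        let mut trades = Vec::new();
        while taker.qty > 0 {
            // Best opposite price level, but only if it crosses the limit.
            let best = match taker.side {
                Side::Buy => self.asks.keys().next().copied().filter(|p| *p <= taker.price),
                Side::Sell => self.bids.keys().next_back().copied().filter(|p| *p >= taker.price),
            };
            let price = match best {
                Some(p) => p,
                None => break,
            };
            let book = match taker.side {
                Side::Buy => &mut self.asks,
                Side::Sell => &mut self.bids,
            };
            let level = book.get_mut(&price).expect("price level exists");
            let maker = level.front_mut().expect("price level is non-empty");
            let qty = taker.qty.min(maker.qty);
            // Trades execute at the resting (maker) order's price.
            trades.push(Trade { maker_id: maker.id, taker_id: taker.id, price, qty });
            taker.qty -= qty;
            maker.qty -= qty;
            if maker.qty == 0 {
                level.pop_front();
            }
            if level.is_empty() {
                book.remove(&price);
            }
        }
        if taker.qty > 0 {
            let book = match taker.side {
                Side::Buy => &mut self.bids,
                Side::Sell => &mut self.asks,
            };
            book.entry(taker.price).or_default().push_back(taker);
        }
        trades
    }
}
```

Matching is deterministic for a given sequence of orders, which is the property a Raft state machine needs: every node that applies the same log produces the same trades.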
- Rust 1.60 or later
- Cargo
cargo build

Note: Append --release for production builds. This project is primarily intended as an example and is not recommended for production use without further hardening.
The easiest way to see the cluster in action is to run the provided test script:
./test-cluster.sh

This script demonstrates a 3-node cluster using only curl commands, showing the HTTP communication between client and cluster.

Alternatively, run the test suite:

cargo test

This runs the same scenario as test-cluster.sh, but uses the Rust ExampleClient.
Start the first node:

./target/debug/raft-key-value --id 1 --http-addr 127.0.0.1:21001

Start additional nodes (in separate terminals):

./target/debug/raft-key-value --id 2 --http-addr 127.0.0.1:21002
./target/debug/raft-key-value --id 3 --http-addr 127.0.0.1:21003

Initialize the cluster:

curl -X POST http://127.0.0.1:21001/init

This initializes node 1 as the leader.
Add nodes 2 and 3 as learners:

curl -X POST -H "Content-Type: application/json" \
    -d '[2, "127.0.0.1:21002"]' \
    http://127.0.0.1:21001/add-learner
curl -X POST -H "Content-Type: application/json" \
    -d '[3, "127.0.0.1:21003"]' \
    http://127.0.0.1:21001/add-learner

Change membership so that all three nodes become voting members:

curl -X POST -H "Content-Type: application/json" \
    -d '[1, 2, 3]' \
    http://127.0.0.1:21001/change-membership

Write data:
curl -X POST -H "Content-Type: application/json" \
    -d '{"Set":{"key":"foo","value":"bar"}}' \
    http://127.0.0.1:21001/write

Read data from any node:

curl -X POST -H "Content-Type: application/json" \
    -d '"foo"' \
    http://127.0.0.1:21002/read

You should be able to read the data from any node!
The project also includes test.sh for more convenient cluster management:
# Start a node
./test.sh start-node 1
# Build the cluster
./test.sh build-cluster
# Check metrics
./test.sh metrics 1
# Clean up
./test.sh clean

src/
├── bin/
│   └── main.rs                 # Server entry point
├── client.rs                   # Example Raft client
├── lib.rs                      # Library with server setup
├── app.rs                      # Application state
├── matchengine/
│   └── mod.rs                  # Order book matching engine
├── network/
│   ├── api.rs                  # Application HTTP endpoints
│   ├── management.rs           # Admin HTTP endpoints
│   ├── raft.rs                 # Raft internal RPC endpoints
│   ├── raft_network_impl.rs    # RaftNetwork implementation
│   └── mod.rs
└── store/
    ├── mod.rs                  # State machine definition
    ├── store.rs                # Snapshot file I/O operations
    └── config.rs               # Storage configuration
The RaftStorage trait implementation is the core of this project. It handles:
- Raft State: Stored in Sled using dedicated keys
- Logs: Stored in Sled with log index as key
- State Machine: In-memory with file-based snapshots
- Votes: Persisted in Sled to ensure election safety
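Storing logs "with log index as key" in an ordered byte-keyed store such as Sled works best when the byte order of the keys matches the numeric order of the indexes. The sketch below shows one common way to get that, big-endian encoding. It is an assumption that this project encodes keys this way; `log_key`, `log_index`, and `LogStore` are hypothetical names, and a `BTreeMap` stands in for the Sled tree.

```rust
use std::collections::BTreeMap;

/// Encode a log index as 8 big-endian bytes, so that comparing keys
/// byte by byte gives the same order as comparing the indexes.
pub fn log_key(index: u64) -> [u8; 8] {
    index.to_be_bytes()
}

/// Decode a key produced by `log_key`; returns None for other lengths.
pub fn log_index(key: &[u8]) -> Option<u64> {
    if key.len() != 8 {
        return None;
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(key);
    Some(u64::from_be_bytes(bytes))
}

/// In-memory stand-in for an ordered byte-keyed log tree.
#[derive(Default)]
pub struct LogStore {
    entries: BTreeMap<[u8; 8], Vec<u8>>,
}

impl LogStore {
    pub fn append(&mut self, index: u64, payload: &[u8]) {
        self.entries.insert(log_key(index), payload.to_vec());
    }

    /// Index of the last entry, found by taking the greatest key.
    pub fn last_index(&self) -> Option<u64> {
        self.entries.keys().next_back().map(|k| u64::from_be_bytes(*k))
    }

    /// Indexes present in the half-open range [from, to), in order.
    pub fn indexes_in(&self, from: u64, to: u64) -> Vec<u64> {
        if from >= to {
            return Vec::new();
        }
        self.entries
            .range(log_key(from)..log_key(to))
            .map(|(k, _)| u64::from_be_bytes(*k))
            .collect()
    }

    /// Remove every entry whose index is <= `index`.
    pub fn purge_upto(&mut self, index: u64) {
        self.entries.retain(|k, _| u64::from_be_bytes(*k) > index);
    }
}
```

With little-endian keys the same range scan would return entries out of order, because index 256 would sort before index 255.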
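let mut config = Config::default().validate().unwrap();
// Build a new snapshot once 500 log entries have accumulated since the last one.
config.snapshot_policy = SnapshotPolicy::LogsSinceLast(500);
// Keep up to 20,000 applied log entries before purging.
config.max_applied_log_to_keep = 20000;
// Timeout for installing a snapshot, in milliseconds.
config.install_snapshot_timeout = 400;

- A snapshot is created every 500 log entries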
- Up to 20,000 log entries are kept before purging
- Snapshot files are stored on disk with metadata in the filename
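The rules above can be expressed as a few small functions. This is a hedged sketch: the exact comparison OpenRaft uses for `LogsSinceLast`, the purge arithmetic, and the `snapshot-<term>-<index>.bin` file name are assumptions chosen for illustration, not the project's actual format.

```rust
/// Hypothetical metadata carried in a snapshot file name.
#[derive(Debug, Clone, PartialEq)]
pub struct SnapshotMeta {
    pub term: u64,
    pub last_log_index: u64,
}

/// True once at least `threshold` entries were applied since the last snapshot.
pub fn should_snapshot(last_applied: u64, last_snapshot_index: u64, threshold: u64) -> bool {
    last_applied.saturating_sub(last_snapshot_index) >= threshold
}

/// Highest log index that can be purged while still keeping the most
/// recent `keep` applied entries; None if nothing can be purged yet.
pub fn purge_upto(last_applied: u64, keep: u64) -> Option<u64> {
    if last_applied > keep {
        Some(last_applied - keep)
    } else {
        None
    }
}

/// Build a file name that embeds the metadata (assumed naming scheme).
pub fn snapshot_file_name(meta: &SnapshotMeta) -> String {
    format!("snapshot-{}-{}.bin", meta.term, meta.last_log_index)
}

/// Recover the metadata from a file name built by `snapshot_file_name`.
pub fn parse_snapshot_file_name(name: &str) -> Option<SnapshotMeta> {
    let stem = name.strip_prefix("snapshot-")?.strip_suffix(".bin")?;
    let mut parts = stem.split('-');
    let term = parts.next()?.parse().ok()?;
    let last_log_index = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // unexpected extra component
    }
    Some(SnapshotMeta { term, last_log_index })
}
```

Keeping the metadata in the file name lets the store pick the newest snapshot by listing a directory, without opening each file.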
On startup, the store:
- Loads the latest snapshot from disk
- Replays any remaining logs from Sled
- Restores the state machine to the latest state
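The startup sequence above amounts to "start from the snapshot, then replay every log entry after it". The following sketch models that with in-memory types. `Command`, `StateMachine`, and `recover` are hypothetical names, the log is a `BTreeMap` rather than a Sled tree, and the log is assumed to contain only committed entries.

```rust
use std::collections::BTreeMap;

/// Hypothetical state machine command, modelled on the "Set" write request.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    Set { key: String, value: String },
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct StateMachine {
    /// Index of the last log entry reflected in `data`.
    pub last_applied: u64,
    pub data: BTreeMap<String, String>,
}

impl StateMachine {
    pub fn apply(&mut self, index: u64, cmd: &Command) {
        match cmd {
            Command::Set { key, value } => {
                self.data.insert(key.clone(), value.clone());
            }
        }
        self.last_applied = index;
    }
}

/// Rebuild the state machine from an optional snapshot plus the log.
/// Entries already covered by the snapshot are skipped.
pub fn recover(snapshot: Option<StateMachine>, log: &BTreeMap<u64, Command>) -> StateMachine {
    let mut sm = snapshot.unwrap_or_default();
    for (index, cmd) in log.range(sm.last_applied + 1..) {
        sm.apply(*index, cmd);
    }
    sm
}
```

Recovering from a snapshot and recovering from the full log should yield the same state; the snapshot only shortens the replay.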
Adding a node to a cluster involves 3 steps:
- Write the node info through the Raft protocol to storage
- Add the node as a learner so that it starts receiving replication data from the leader
- Change membership to promote the learner to a full voting member
Note: Raft itself does not store node addresses. This implementation stores node information in the storage layer, and the network layer references the store to look up target node addresses.
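The three steps, and the address lookup described in the note, can be modelled as simple state transitions. This sketch is deliberately simplified: `Cluster` and `ClusterError` are hypothetical, everything happens in memory on one node, and real membership changes in Raft go through the replicated log rather than direct mutation.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq)]
pub enum ClusterError {
    /// The node's address was never written to storage.
    UnknownNode(u64),
    /// The node is neither a voter nor a learner yet.
    NotLearner(u64),
}

pub struct Cluster {
    nodes: BTreeMap<u64, String>, // id -> address, kept in the storage layer
    learners: BTreeSet<u64>,
    voters: BTreeSet<u64>,
}

impl Cluster {
    /// Initialize a single-node cluster with `id` as the only voter.
    pub fn init(id: u64, addr: &str) -> Self {
        let mut cluster = Cluster {
            nodes: BTreeMap::new(),
            learners: BTreeSet::new(),
            voters: BTreeSet::new(),
        };
        cluster.nodes.insert(id, addr.to_string());
        cluster.voters.insert(id);
        cluster
    }

    /// Step 1: record the node's address.
    pub fn write_node(&mut self, id: u64, addr: &str) {
        self.nodes.insert(id, addr.to_string());
    }

    /// Step 2: start replicating to the node; its address must be known.
    pub fn add_learner(&mut self, id: u64) -> Result<(), ClusterError> {
        if !self.nodes.contains_key(&id) {
            return Err(ClusterError::UnknownNode(id));
        }
        self.learners.insert(id);
        Ok(())
    }

    /// Step 3: promote learners; every member must already be a voter or learner.
    pub fn change_membership(&mut self, members: &[u64]) -> Result<(), ClusterError> {
        for id in members {
            if !self.voters.contains(id) && !self.learners.contains(id) {
                return Err(ClusterError::NotLearner(*id));
            }
        }
        self.voters = members.iter().copied().collect();
        for id in members {
            self.learners.remove(id);
        }
        Ok(())
    }

    /// What the network layer would call to find a target node's address.
    pub fn address_of(&self, id: u64) -> Option<&str> {
        self.nodes.get(&id).map(|s| s.as_str())
    }

    pub fn voters(&self) -> Vec<u64> {
        self.voters.iter().copied().collect()
    }
}
```

Validating before mutating means a rejected membership change leaves the cluster state untouched, which mirrors why the learner step comes first: a node should be caught up before it counts toward a quorum.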
Raft internal RPC endpoints:
- POST /raft-append - Append entries RPC
- POST /raft-snapshot - Install snapshot RPC
- POST /raft-vote - Request vote RPC

Management endpoints:
- POST /init - Initialize a single-node cluster
- POST /add-learner - Add a learner node
- POST /change-membership - Change cluster membership
- GET /metrics - Get Raft metrics

Application endpoints:
- POST /write - Write data to the state machine
- POST /read - Read data from the state machine (local read)
- POST /consistent_read - Consistent read (goes through leader)
Possible future improvements:
- Optimize serialization (replace JSON with Protobuf/Avro)
- Improve network layer with gRPC
- Add matching result distribution via message queues
- Implement more matching algorithms
- Add comprehensive testing and benchmarking
- Enhance client library
This project is a fork of example-raft-kv in the OpenRaft project.
Special thanks to the Databend community and Zhang Yanpo for their support.
Please refer to the original OpenRaft project for licensing information.