Skip to content
GitLab
Explore
Sign in
Primary navigation
Search or go to…
Project
C
cmsc818fall2023projects
Manage
Activity
Members
Labels
Plan
Issues
Issue boards
Milestones
Wiki
Requirements
Code
Merge requests
Repository
Branches
Commits
Tags
Repository graph
Compare revisions
Snippets
Locked files
Build
Pipelines
Jobs
Pipeline schedules
Test cases
Artifacts
Deploy
Releases
Package Registry
Container Registry
Model registry
Operate
Environments
Terraform modules
Monitor
Incidents
Service Desk
Analyze
Value stream analytics
Contributor analytics
CI/CD analytics
Repository analytics
Code review analytics
Issue analytics
Insights
Model experiments
Help
Help
Support
GitLab documentation
Compare GitLab plans
Community forum
Contribute to GitLab
Provide feedback
Keyboard shortcuts
?
Snippets
Groups
Projects
Show more breadcrumbs
Peter Keleher
cmsc818fall2023projects
Commits
44491ae0
Commit
44491ae0
authored
1 year ago
by
Peter J. Keleher
Browse files
Options
Downloads
Patches
Plain Diff
auto
parent
d0faf49d
No related branches found
Branches containing commit
No related tags found
No related merge requests found
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
p5.md
+158
-0
158 additions, 0 deletions
p5.md
with
158 additions
and
0 deletions
p5.md
0 → 100644
+
158
−
0
View file @
44491ae0
# RAFT
**v1.0**
## Due: Nov 19, 2023, 11:59:59 pm
## Setup
Download startup files
[
here
](
https://sedna.cs.umd.edu:818/projects/p5.tgz
)
.
## Part I: Failure-free operation
Implement raft as described in
[
Figure 2
](
https://sedna.cs.umd.edu/papers/raftFigure.pdf
)
of the
[
paper
](
https://sedna.cs.umd.edu/papers/raft.pdf
)
.
In particular:
-
implement the log
-
implement leader election
-
election timeouts
-
election restriction
-
appendentries
-
heartbeat
-
"catching up" a lagging follower
-
committing
-
informing the client
-
failure resilience
## Failures you need to handle
-
**Failure of a client:**
Client failures should not affect anything.
-
**Failure of the leader:**
Your system should detect failures, elect a
new leader (with
*server selection*
), and continue processing commands
from clients while a quorum is still available. For example: a 5-node
system should be able to recover from leader failure and continue
committing client operations.
-
**Slow followers:**
i.e. followers that are behind and need to be
"caught up" to the leader's log state. We will implement slow
followers by "pausing" containers use the command
`docker compose
pause <hostname>`
, and later un-pausing them.
## Command Syntax and Observable Output
Your server must be implemented in a single file at the top level named
`raft.go`
. Your client must be at the same level and named
`client.go`
. The only other files you need are
`Dockerfile`
and
`compose.yaml`
. You will also need the
`/pb`
subdirectory.
Your servers and clients are called as follows:
-
`raft.go -s <listen addr this server>,<server 0>,<server 1>,.... [-d]`
:
For example, we might start a server with
`go run raft.go -s localhost:8002,localhost:8000,localhost:8001,localhost:8002 -d`
.
This will be a single raft server instance that listens at address
`:8002`
, in a system consisting of three servers listening at
`localhost:8000`
(id 0),
`localhost:8001`
(id 1), and
`localhost:8002`
(id 2). The
`-s`
argument is mandatory, but the
`-d`
flag is not.
-
`client.go <leader> <command>`
: The "leader" is the presumed leader
of the system and the command is a simple uninterpreted string.
For example:
`go run client.go
localhost:8002 one`
submits the command "one" to
`localhost:8002`
. If this host is not the leader it returns
*false*
to the client, which prints this out. The RPC to the server will not
return until the command is committed, at which point the client
will print out "true".
### Raft Server Output
Your server may print anything with the
`-d`
flag. Without this flag,
however, it should print at most one line acknowledging when startup
is complete, and then only the following output:
---
At each heartbeat the leader should print:
```
AE [<match index array>] ci:<commit index>, <log len>[<term>-"<cmd>",....]
```
For example, your leader might print out:
```
AE [-1 -1] ci:4, 5[8-"L0",8-"one",8-"two",9-"L1",10-"L1"]
```
The matchindex is an integer slice described as in Figure 2. A
null/nil or "no match" is signified w/ a -1.
`ci`
is the current
leader commit index. The number before the left bracket ('[') is the
current log length. The entire log is then printed, surrounded by
brackets. Entries have format
`<term>-"<cmd>"`
, and are separated by
commas.
---
Followers should print the same, but use " " instead of "AE".
---
When a follower converts to candidacy it should print out
"election" alone on a line.
---
During catchup, a server should print a single line for each time it
goes backwards:
```
lastLogIndex <index> <(address:port)
```
---
There should be
**no other output**
, and this output must be followed exactly.
## Details
-
Use whatever timeouts you want during debugging, but you should use those in the startup file when submitting.
## Testing.
**Test 1 (10 pts)**
Two servers.
-
Two-server election, both have to agree.
-
Submit a command and verify that the client does not return until
the command has been committed and applied.
**Test 2 (10 pts)**
Continuing the above...
-
Kill the leader and immediately restart; the follower must
become the new leader because it's log is more up-to-date.
-
Commit another command and power-cycle the leader again, again verifying
that the follower, with it's more up-to-date log, become leader.
**Test 3 (10 pts)**
Three servers.
**Test 4 (10 pts)**
Compose w/ three servers.
**Test 5 (10 pts)**
Compose w/ three servers:
-
pause a follower, verify that the commands can still be committed
-
unpause the follower, verify that the follower catches up
**Test 6 (10 pts)**
Compose w/ three servers:
-
pause the leader, verify that the commands can still be committed
after new election
-
unpause the previous leader, verify that it rejoins, reverts to a follower, catches up
**Test 7 (10 pts)**
Compose w/ five servers.
```
> cli localhost:5501 put one
redirect localhost:5502
> cli localhost:5502 put one
term 1, index 0: commit "one"
```
Leader fails:
```
> cli localhost:5502 put two
redirect localhost:5501
> cli localhost:5502 put two
term 2, index 1: commit "two"
```
This diff is collapsed.
Click to expand it.
Preview
0%
Loading
Try again
or
attach a new file
.
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Save comment
Cancel
Please
register
or
sign in
to comment