# Project 6: Supporting High-level Abstractions From a Shared Log
**v1.0**
<br>
**Due Dec 10**
## Setup
Download files [here](https://ceres.cs.umd.edu/818/projects/p6.tgz?1).
## Overview
![tango-raft](tangoRaft.jpg)
This project will require you to build three *conflict-free replicated
data types* on top of the shared, replicated log you built in P5.
For all three types, *writes* are in the log, *reads* are not.
For example, with the *intCRDT*, an integer-increment conflict-free
replicated data type (CRDT), each increment is written to the log.
Reading the type consists of traversing the log and adding all the
increments.
With the *transactional key store*, writes to the KV are in the log,
reads are not. This is complicated by the transactional nature of the
KV. Like w/ Tango, we expect transactions to support strict
serializability. Writes from transactions that abort do not affect the
KV.
Finally, in the *tree* type, you will create a shared tree that
supports concurrent modifications via simple mutations.

You will build your tango support in a new `p6/tango` module. Your
applications will call the tango module API to access the shared log.
## Building and Testing the Tango Interface
Our "tango" implementation is based relatively closely on the
Tango paper [1] we discussed in class. In particular, the paper
defines a Tango runtime abstraction that provides `update_helper`s and
`query_helper`s. The former adds a command to the shared log, while
the latter brings a local object up-to-date with respect to the most
recent local view of the log.
We will dispense w/ the object wrapper and define:
```
func TangoQueryHelper(obj TangoObject)
func TangoUpdateHelper(obj TangoObject, cmd string) string
```
Our shared log commands are strings, so the update helper merely adds a string
to the log.
We also define a *TangoObject*:
```
type TangoObject interface {
    Oid() int64 // object ID
    Tid() int64 // transaction ID
    Apply(data string)
}
```
All three of your abstractions will conform to the TangoObject
interface, allowing your tango system layer to be oblivious to the
application semantics.
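As an illustration of what conforming to the interface might look like, here is a minimal sketch of a counter object. The `IntCRDT` struct, its field names, and the decimal-string command encoding are assumptions made for this example, not part of the provided code.

```go
package main

import (
	"fmt"
	"strconv"
)

// TangoObject mirrors the interface given in the text.
type TangoObject interface {
	Oid() int64 // object ID
	Tid() int64 // transaction ID
	Apply(data string)
}

// IntCRDT is a hypothetical counter object; the field names are assumptions.
type IntCRDT struct {
	oid   int64
	state int64
}

func (c *IntCRDT) Oid() int64 { return c.oid }

// Tid returns 0 because the counter is not transactional.
func (c *IntCRDT) Tid() int64 { return 0 }

// Apply assumes each log command is a decimal increment such as "3" or "-2".
// Commands that do not parse are ignored.
func (c *IntCRDT) Apply(data string) {
	n, err := strconv.ParseInt(data, 10, 64)
	if err != nil {
		return
	}
	c.state += n
}

// Compile-time check that *IntCRDT satisfies TangoObject.
var _ TangoObject = (*IntCRDT)(nil)

func main() {
	c := &IntCRDT{oid: 7}
	for _, cmd := range []string{"1", "4", "-2"} {
		c.Apply(cmd)
	}
	fmt.Println(c.Oid(), c.state)
}
```

Because the tango layer only ever sees `Oid()`, `Tid()`, and `Apply()`, the same layer can drive the KV store and the tree without knowing what their command strings mean.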
## Details
### Changes to the RPC definitions
1. `CommandRequest` includes fields `Oid` and `Tid`.
1. `LogEntry` also includes `Oid` and `Tid` fields.
1. New `RetrieveCommitted` RPC to retrieve all *committed* log entries
   after a particular log slot.
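The intended behavior of `RetrieveCommitted` can be sketched over an in-memory slice. This is a plain-Go stand-in rather than the actual protobuf definitions: `Oid` and `Tid` come from the list above, while `Slot`, `Cmd`, and the `retrieveCommitted` helper with its `commitSlot` parameter are hypothetical names used only for illustration.

```go
package main

import "fmt"

// LogEntry is a plain-Go stand-in for the protobuf message.
// Oid and Tid come from the text; Slot and Cmd are assumed names.
type LogEntry struct {
	Slot int64
	Oid  int64
	Tid  int64
	Cmd  string
}

// retrieveCommitted returns the entries with after < Slot <= commitSlot,
// i.e. everything committed after the caller's last known slot.
func retrieveCommitted(log []LogEntry, commitSlot, after int64) []LogEntry {
	var out []LogEntry
	for _, e := range log {
		if e.Slot > after && e.Slot <= commitSlot {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	log := []LogEntry{
		{Slot: 1, Oid: 2, Cmd: "1"},
		{Slot: 2, Oid: 5, Cmd: "3"},
		{Slot: 3, Oid: 2, Cmd: "4"},
		{Slot: 4, Oid: 5, Cmd: "1"}, // appended but not yet committed
	}
	// The caller already holds slot 1; the commit point is slot 3.
	for _, e := range retrieveCommitted(log, 3, 1) {
		fmt.Println(e.Slot, e.Oid, e.Cmd)
	}
}
```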
### The *tango log*
The tango module keeps a local copy of the shared log, updated
whenever `TangoQueryHelper()` is called. Much of the time this log is
therefore only a prefix of the full shared log.
To make this more concrete, consider the workflow of `intCRDT`, which
implements reads and conflict-free increments (updates). Each `intCRDT`
object consists only of its object ID ("oid") and its state
("state"). An increment is written to the log via `UpdateHelper`,
which you will implement in the tango module.

`UpdateHelper` packages the increment and the OID into a new
`pb.LogEntry`, and sends it to the local raft instance via the
`Command()` RPC, which has been enhanced to allow both object and
transaction IDs to be specified. `intCRDT` is not transactional, so
the TID can be left blank (the "zero value" of an int is...0).

`Command()` is synchronous, so it does not return until the increment
has been committed.
Reads of `intCRDT`s are implemented by calling `QueryHelper`,
parameterized by OID, which has the following tasks:

- Sync the local copy of the log w/ the shared version via the new
  `RetrieveCommitted()` raft RPC. The request specifies the library's
  last local entry; the RPC returns everything after it.
- Parse through each of the log entries, in order, calling
  `TangoObject.Apply()` for each entry that updates the object w/ the
  given OID.

The object's value does not have to be returned because the `Apply()`
calls will have already updated the `intCRDT`.
The parser in `intCRDT.go` updates several objects, displaying the value
of each at the end. The simple input script `scriptInt.1` has two fields
per line: the *oid* and the increment to be applied to that object. If
multiple applications run the same script concurrently, the exact
interleaving is non-deterministic, but the final value should remain
unchanged if we run the same two copies multiple times.
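The reason the final values do not depend on the interleaving is that addition commutes. A small sketch, using made-up increments rather than the contents of `scriptInt.1`, replays two copies of the same script in two different log orders and compares the results; the `inc` type and `replay` function are hypothetical.

```go
package main

import "fmt"

// inc is one script line: an object ID and the increment applied to it.
type inc struct {
	oid   int64
	delta int64
}

// replay sums the increments per object, in log order.
func replay(log []inc) map[int64]int64 {
	state := map[int64]int64{}
	for _, e := range log {
		state[e.oid] += e.delta
	}
	return state
}

func main() {
	script := []inc{{1, 2}, {2, 5}, {1, 3}} // illustrative values only

	// Interleaving A: one copy runs to completion, then the other.
	var sequential []inc
	sequential = append(sequential, script...)
	sequential = append(sequential, script...)

	// Interleaving B: the two copies alternate line by line.
	var alternating []inc
	for _, e := range script {
		alternating = append(alternating, e, e)
	}

	a, b := replay(sequential), replay(alternating)
	fmt.Println(a[1], a[2], b[1], b[2])
}
```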
### Transactional semantics
The above is relatively straightforward, but transactional semantics
require a bit more mechanism. First, every line in `scriptKV.1` consists
of an *oid*, a *tid* (transaction ID), and a command. The commands have
the following implications:

- "START" (transaction): Each client application, together with its
  instantiation of the tango library, is assumed to be
  single-threaded. Only a single transaction is in progress from the
  client at any time. The Tango transactional implementation relies on
  maintaining readsets for the current transaction. Since only one
  local transaction can be active at a time, only one readset need be
  maintained. This readset is re-initialized at each transaction start.
- "FINISH" (transaction): The app signals a commit request by issuing
  "FINISH" to the raft abstraction, annotated with the current readset.
- "READ": calls `QueryHelper()`, and then returns the current value.
- "SLEEP": sleeps an integer number of seconds.
- All others are strings that should be copied verbatim to the oid as
  new values for the associated object.
All transactional fates are determined independently, but
deterministically, by each tango library. Recall (from the paper)
that an object version can be specified by the index of the last log
slot that modified the object. The "FINISH" command is annotated with
the complete transactional readset by `UpdateHelper()` when called
with a transaction "FINISH". The readset specifies each object read
(as signaled by calls to `QueryHelper`), and the version seen by each
read. For example, "FINISH,2-3,5-6,5-9" says that objects "2" and "5"
were read during the transaction. The read of object 2 saw version 3,
while there were two distinct reads of object 5, seeing versions 6 and
9, respectively.
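A minimal sketch of decoding such an annotated command follows. The comma and dash separators are taken from the example above; the `Read` type and the `ParseFinish` function are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Read records one read: the object and the version (the last log slot
// that modified the object) seen by that read.
type Read struct {
	Oid     int64
	Version int64
}

// ParseFinish decodes a command such as "FINISH,2-3,5-6,5-9".
func ParseFinish(cmd string) ([]Read, error) {
	parts := strings.Split(cmd, ",")
	if parts[0] != "FINISH" {
		return nil, fmt.Errorf("not a FINISH command: %q", cmd)
	}
	reads := make([]Read, 0, len(parts)-1)
	for _, p := range parts[1:] {
		oid, ver, ok := strings.Cut(p, "-")
		if !ok {
			return nil, fmt.Errorf("malformed readset item %q", p)
		}
		o, err := strconv.ParseInt(oid, 10, 64)
		if err != nil {
			return nil, err
		}
		v, err := strconv.ParseInt(ver, 10, 64)
		if err != nil {
			return nil, err
		}
		reads = append(reads, Read{Oid: o, Version: v})
	}
	return reads, nil
}

func main() {
	reads, err := ParseFinish("FINISH,2-3,5-6,5-9")
	fmt.Println(reads, err)
}
```

A bare "FINISH" with no reads parses to an empty readset, which corresponds to a transaction that read nothing.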
Transactions commit only if no read objects are subsequently modified
before the transaction attempts to commit. For example, the above
transaction would be aborted if object 2 is modified to version 14 by
a remote transaction *before* the local transaction finishes.

Read objects *may* be modified by the local transaction and re-read
with a different result *without* causing the transaction to abort.
These semantics are implemented in the query helper, which downloads
the most recent shared log suffix and parses the log entries to
determine the fate of any new transactions. Once a transaction's fate
is determined, the "FINISH" is changed to either "COMMIT" or "ABORT"
*in the local copy of the log*.
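One possible shape for the fate decision is sketched below, under several assumptions that the text does not settle: log slots are 1-based, a conflicting modification is any write entry for a read object that lies strictly between the version read and the "FINISH" slot and carries a different TID, and writes from transactions already known to have aborted are ignored. The `decide` and `isWrite` helpers and the `aborted` map are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// LogEntry is a simplified stand-in for pb.LogEntry; Cmd is an assumed name.
type LogEntry struct {
	Oid, Tid int64
	Cmd      string
}

// Read is one readset item: object ID and version (log slot) seen.
type Read struct{ Oid, Version int64 }

// isWrite reports whether cmd modifies an object. Transaction markers do
// not. (Assumption: no stored value starts with "FINISH".)
func isWrite(cmd string) bool {
	return cmd != "START" && cmd != "COMMIT" && cmd != "ABORT" &&
		!strings.HasPrefix(cmd, "FINISH")
}

// decide returns "COMMIT" or "ABORT" for transaction tid whose FINISH
// entry sits at slot finish. log[i] holds slot i+1.
func decide(log []LogEntry, tid, finish int64, reads []Read, aborted map[int64]bool) string {
	for _, r := range reads {
		for slot := r.Version + 1; slot < finish; slot++ {
			e := log[slot-1]
			// Writes by the transaction itself are allowed.
			if e.Oid == r.Oid && e.Tid != tid && isWrite(e.Cmd) && !aborted[e.Tid] {
				return "ABORT"
			}
		}
	}
	return "COMMIT"
}

func main() {
	log := []LogEntry{
		{Oid: 2, Tid: 0, Cmd: "a"},              // slot 1: object 2 reaches version 1
		{Tid: 1, Cmd: "START"},                  // slot 2
		{Oid: 2, Tid: 1, Cmd: "b"},              // slot 3: local write, harmless
		{Oid: 5, Tid: 2, Cmd: "x"},              // slot 4: remote write to object 5
		{Tid: 1, Cmd: "FINISH,2-1,5-0"},         // slot 5
	}
	reads := []Read{{2, 1}, {5, 0}}
	fmt.Println(decide(log, 1, 5, reads, nil))
	fmt.Println(decide(log, 1, 5, reads, map[int64]bool{2: true}))
}
```

Because every library scans the same log prefix in the same order, each one reaches the same verdict without any extra coordination, which is what lets the "FINISH" be rewritten locally.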
To summarize from the application point-of-view: transaction "starts"
and "finishes" are sent to the tango module via `UpdateHelper()`, but
the app sees no other application details, and is not part of the
determination of any transaction's fate. The app objects are affected
only by `Apply()` calls. These calls are immediate for
non-transactional updates, but *calls for transactional updates are
delayed until a transaction is known to have committed*.
## Testing
1. I will test by running two copies of `intCRDT.go` against "scriptInt.1"
   concurrently, multiple times. Each time the end result should be the
   same regardless of interleaving.
2. I will test the transaction KV store by running `kv.go` with
   "scriptKV.1" with one app. As the third READ is seen, I will start
   another instance of `kv.go` running "scriptKV.2", which should
   cause transaction 2 to abort. Transactions 1 and 3 will commit.
3. You should come up with scripts, similar to the KV scripts, to test your
   log-based tree implementation. Details should be in your README.md file.
## Random Details
- The tree should support mutations such as:
  - add a child
  - move a child
  - delete a child

  Transactional semantics allow these to be combined atomically.
## Submitting
Submit by pushing to your repository.
- DO update the `README.md` to reflect what works, and what does not.
- Describe your tree implementation, and specify how I can recreate
  your demonstration.
- Upload a video `demo.mp4` that shows you demonstrating all of your
  functionality, as if this is the only thing I see. Note that I *will*
  look at your code, and attempt to duplicate some of the functionality
  shown in your video, but your video should be complete. If you are on
  a mac, please use the "HandBrake" app to remove some of the bloat
  (default configuration is fine). Do NOT upload a `.mov`.
## Bibliography
```
[1] Balakrishnan, Mahesh, et al. "Tango: Distributed data structures
    over a shared log." Proceedings of the twenty-fourth ACM symposium
    on operating systems principles. 2013.
```