-
Peter J. Keleher authoredPeter J. Keleher authored
Project 6: Supporting High-level Abstractions From a Shared Log
v1.0
Due Dec 10
Setup
Download files here.
Overview
This project requires you to build three conflict-free replicated data types on top of the shared, replicated log you built in P5. For all three types, writes are in the log, reads are not.
For example, with the intCRDT, an integer increment conflictfree-replicated-data-type (CRDT), each increment is written to the log. Reading the type consists of traversing the log and adding all the increments.
With the transactional key store, writes to the KV are in the log, reads are not. This is complicated by the transactional nature of the KV. Like w/ Tango, we expect transactions to support strict serializability. Writes from transactions that abort do not affect the KV.
Finally, in the tree type, you will create a shared tree that supports concurrent modifications via simple mutations. You build this entirely from scratch.
You will build your tango support in a new p6/tango
module. Your
applications will call the tango module API to access the shared log.
Building and Testing the Tango Interface
Our "tango" implementation is based relatively closely on the the
Tango paper [1] we discussed in class. In particular, the paper
defines a Tango runtime abstraction that provides update_helper
s and
query_helper
s. The former adds a command to the shared log, while
the latter brings a local object up-to-date with respect to the most
recent local view of the log.
We will dispense w/ the object wrapper and define:
func TangoQueryHelper(obj TangoObject) ()
func TangoUpdateHelper(obj TangoObject, cmd string) string ()
Our shared log commands are strings, so the update helper merely adds a string to the log.
We also define a TangoObject:
type TangoObject interface {
Oid() int64 // object ID
Tid() int64 // transaction ID
Apply(data string)
}
All three of your abstractions will conform to the TangoObject interface, allowing your tango system layer to be oblivious to the application semantics.
Details
Changes to the RPC definitions
-
CommandRequest
includes fieldsOid
andTid
. -
LogEntry
also includesOid
andTid
fields. - New
RetrieveCommitted
RPC to retrieve all committed log entries after a particular log slot.
The tango log
The tango module keeps a local copy of the shared log, updated
whenever TangoQueryHelper()
is called. Much of the time this log is
therefore only a prefix of the full shared log.
To make this more concrete, consider the workflow of intCRDT
, with
implements reads and conflict-free increments (updates). Each
intCRDT
object consists only of it's object ID ("oid") and its state
("state"). An increment is written to the log via UpdateHelper
,
which you will implement in the tango module.
UpdateHelper
packages the increment and the OID into a new pb.LogEntry
, and sends
to the local raft instantiation via the Command()
RPC, which has
been enhanced to allow both object and transaction IDs to be
specified. intCRDT
is not transactional, so the TID can be left
blank (the "zero value" of an int is...0).
Command()
is synchronous, so it does not return until the increment
has been committed.
Reads of intCRDTs
are implemented by calling QueryHelper
,
parameterized by OID, which
has the following tasks:
- Sync the local copy of the log w/ the shared version via the new
RetrieveCommitted()
raft RPC. The request specifies the library's last local entry; the RPC returns everything after it. - Parse through each of the log entries, in order, calling
TangoObject.Apply()
for each entry updating the object w/ OID. The object's value does not have to be returned because theApply()
calls will have already updated theintCRDT
.
The parser in intCRDT.go
updates several objects, displaying the value
of each at the end. The simple input script scriptInt.1
has two fields per line:
the oid and the increment to be applied to that object. If multiple
applications run the same script concurrently, the exact interleaving
is non-deterministic, but the final value should remain unchanged if
we run the same two copies multiple times.
Transactional semantics
The above is relatively straightforward, but transactional semantics
require a bit more mechanism. First, every line in scriptKV.1
consists of
an oid, a tid (transaction ID), and a command. The commands have
the following implications:
- "START" (transaction): Each client application, together with it's instantiation of the tango library, as assumed to be single-threaded. Only a single transaction is in progress from the client at any time. The Tango transactional implementation relies on maintaining readsets for the current transaction. Since only one local transaction can be active at a time, only one readset need be maintained. This readset is re-initialized at each transaction start.
- "FINISH" (transaction): The app signals a commit request by issuing "FINISH" to the raft abstraction, annotated with the current readset.
- "READ": calls
QueryHelper()
, and then returns the current value. - "SLEEP": sleeps an integer number of seconds.
- All others are strings that should be copied verbatim to the oid as new values for the associate object.
All transactional fates are determined independently, but
deterministically, by each tango library. Recall (from the paper),
that an object version can be specified by the index of the last log
slot that modified the object. The "FINISH" command is annotated with
the complete transactional readset by UpdateHelper()
when called
with a transaction "FINISH". The readset specifies each object read
(as signaled by calls to QueryHelper
), and the version seen by each
read. For example, "FINISH,2-3,5-6,5-9" says that objects "2" and "5"
were read during the transaction. The read of object 2 saw version 3,
while there were two distinct reads of object 5, seeing versions 6 and
9, respectively.
Transactions commit only if no read objects are subsequently modified before the transaction attempts to commit. For example, the above transaction would be aborted if object 2 is modified to version 14 by a remote transaction before the local transaction finishes. Read objects may be modified by the local transaction and re-read with different result without causing the transaction to abort.
These semantics are implemented in the query helper, which downloads the most recent shared log suffix and parses the log entries to determine the fate of any new transactions. Once a transaction's fate is determined, The "FINISH" is changed to either "COMMIT" or "ABORT" in the local copy of the log.
To summarize from the application point-of-view: transaction "starts"
and "finishes" are sent to the tango module via UpdateHelper()
, but
the app sees no other application details, and is not part of the
determination any transaction's fate. The app objects are affected
only by Apply()
calls. These calls are immediate for
non-transactional updates, but calls for transactional updates are
delayed until a transaction is known to have committed.
Testing.
- I will test by running two copies of
intCRDT.go
against "scriptInt.1" concurrently, multiple times. Each time the end result should be the same regardless of interleaving. - I will test the transaction KV store by running
kv.go
with "scriptKV.1" with one app. As the third READ is seen, I will start another instance ofkv.go
running "scriptKV.2", which should cause transaction 2 to abort. Transactions 1 and 3 will commit. - You should come up scripts, similar to the KV scripts, to test your log-based tree implementation. Details should be in your README.md file.
Random Details
- The tree should support mutations such as:
- add a child
- move a child
- delete a child Transactional semantics allow these to be combined atomically.
Submitting
Submit by pushing to your repository.
- DO update the
README.md
to reflect what works, and what does not. - Describe your tree implementation, and specify how I can recreate your demonstration.
- Upload a video
demo.mp4
that shows you demonstrating all of your functionality, as if this is the only thing I see. Note that I will look at your code, and attempt to duplicate some of the functionality shown in your video, but your video should be complete. If you are on a mac, please use the "HandBrake" app to remove some of the bloat (default configuration is fine). For example, my 629MB .mov turned into an 87MB .mp4 w/ no noticable degradation. Do NOT upload a.mov
.
Video Walkthrough
Bibliography
[1] Balakrishnan, Mahesh, et al. "Tango: Distributed data structures
over a shared log." Proceedings of the twenty-fourth ACM symposium
on operating systems principles. 2013.