Skip to content
Snippets Groups Projects
Code owners
Assign users and groups as approvers for specific file changes. Learn more.

Project 6: Supporting High-level Abstractions From a Shared Log

v1.0
Due Dec 10

Setup

Download files here.

Overview

This project requires you to build three conflict-free replicated data types on top of the shared, replicated log you built in P5. For all three types, writes are in the log, reads are not.

For example, with the intCRDT, an integer increment conflictfree-replicated-data-type (CRDT), each increment is written to the log. Reading the type consists of traversing the log and adding all the increments.

With the transactional key store, writes to the KV are in the log, reads are not. This is complicated by the transactional nature of the KV. Like w/ Tango, we expect transactions to support strict serializability. Writes from transactions that abort do not affect the KV.

Finally, in the tree type, you will create a shared tree that supports concurrent modifications via simple mutations. You build this entirely from scratch.

You will build your tango support in a new p6/tango module. Your applications will call the tango module API to access the shared log.

Building and Testing the Tango Interface

Our "tango" implementation is based relatively closely on the the Tango paper [1] we discussed in class. In particular, the paper defines a Tango runtime abstraction that provides update_helpers and query_helpers. The former adds a command to the shared log, while the latter brings a local object up-to-date with respect to the most recent local view of the log.

We will dispense w/ the object wrapper and define:

func TangoQueryHelper(obj TangoObject) ()
func TangoUpdateHelper(obj TangoObject, cmd string) string ()

Our shared log commands are strings, so the update helper merely adds a string to the log.

We also define a TangoObject:

type TangoObject interface {
	Oid() int64          // object ID
	Tid() int64          // transaction ID
	Apply(data string)
}

All three of your abstractions will conform to the TangoObject interface, allowing your tango system layer to be oblivious to the application semantics.

Details

Changes to the RPC definitions

  1. CommandRequest includes fields Oid and Tid.
  2. LogEntry also includes Oid and Tid fields.
  3. New RetrieveCommitted RPC to retrieve all committed log entries after a particular log slot.

The tango log

The tango module keeps a local copy of the shared log, updated whenever TangoQueryHelper() is called. Much of the time this log is therefore only a prefix of the full shared log.

To make this more concrete, consider the workflow of intCRDT, with implements reads and conflict-free increments (updates). Each intCRDT object consists only of it's object ID ("oid") and its state ("state"). An increment is written to the log via UpdateHelper, which you will implement in the tango module.

UpdateHelper packages the increment and the OID into a new pb.LogEntry, and sends to the local raft instantiation via the Command() RPC, which has been enhanced to allow both object and transaction IDs to be specified. intCRDT is not transactional, so the TID can be left blank (the "zero value" of an int is...0). Command() is synchronous, so it does not return until the increment has been committed.

Reads of intCRDTs are implemented by calling QueryHelper, parameterized by OID, which has the following tasks:

  • Sync the local copy of the log w/ the shared version via the new RetrieveCommitted() raft RPC. The request specifies the library's last local entry; the RPC returns everything after it.
  • Parse through each of the log entries, in order, calling TangoObject.Apply() for each entry updating the object w/ OID. The object's value does not have to be returned because the Apply() calls will have already updated the intCRDT.

The parser in intCRDT.go updates several objects, displaying the value of each at the end. The simple input script scriptInt.1 has two fields per line: the oid and the increment to be applied to that object. If multiple applications run the same script concurrently, the exact interleaving is non-deterministic, but the final value should remain unchanged if we run the same two copies multiple times.

Transactional semantics

The above is relatively straightforward, but transactional semantics require a bit more mechanism. First, every line in scriptKV.1 consists of an oid, a tid (transaction ID), and a command. The commands have the following implications:

  • "START" (transaction): Each client application, together with it's instantiation of the tango library, as assumed to be single-threaded. Only a single transaction is in progress from the client at any time. The Tango transactional implementation relies on maintaining readsets for the current transaction. Since only one local transaction can be active at a time, only one readset need be maintained. This readset is re-initialized at each transaction start.
  • "FINISH" (transaction): The app signals a commit request by issuing "FINISH" to the raft abstraction, annotated with the current readset.
  • "READ": calls QueryHelper(), and then returns the current value.
  • "SLEEP": sleeps an integer number of seconds.
  • All others are strings that should be copied verbatim to the oid as new values for the associate object.

All transactional fates are determined independently, but deterministically, by each tango library. Recall (from the paper), that an object version can be specified by the index of the last log slot that modified the object. The "FINISH" command is annotated with the complete transactional readset by UpdateHelper() when called with a transaction "FINISH". The readset specifies each object read (as signaled by calls to QueryHelper), and the version seen by each read. For example, "FINISH,2-3,5-6,5-9" says that objects "2" and "5" were read during the transaction. The read of object 2 saw version 3, while there were two distinct reads of object 5, seeing versions 6 and 9, respectively.

Transactions commit only if no read objects are subsequently modified before the transaction attempts to commit. For example, the above transaction would be aborted if object 2 is modified to version 14 by a remote transaction before the local transaction finishes. Read objects may be modified by the local transaction and re-read with different result without causing the transaction to abort.

These semantics are implemented in the query helper, which downloads the most recent shared log suffix and parses the log entries to determine the fate of any new transactions. Once a transaction's fate is determined, The "FINISH" is changed to either "COMMIT" or "ABORT" in the local copy of the log.

To summarize from the application point-of-view: transaction "starts" and "finishes" are sent to the tango module via UpdateHelper(), but the app sees no other application details, and is not part of the determination any transaction's fate. The app objects are affected only by Apply() calls. These calls are immediate for non-transactional updates, but calls for transactional updates are delayed until a transaction is known to have committed.

Testing.

  1. I will test by running two copies of intCRDT.go against "scriptInt.1" concurrently, multiple times. Each time the end result should be the same regardless of interleaving.
  2. I will test the transaction KV store by running kv.go with "scriptKV.1" with one app. As the third READ is seen, I will start another instance of kv.go running "scriptKV.2", which should cause transaction 2 to abort. Transactions 1 and 3 will commit.
  3. You should come up scripts, similar to the KV scripts, to test your log-based tree implementation. Details should be in your README.md file.

Random Details

  • The tree should support mutations such as:
    • add a child
    • move a child
    • delete a child Transactional semantics allow these to be combined atomically.

Submitting

Submit by pushing to your repository.

  • DO update the README.md to reflect what works, and what does not.
  • Describe your tree implementation, and specify how I can recreate your demonstration.
  • Upload a video demo.mp4 that shows you demonstrating all of your functionality, as if this is the only thing I see. Note that I will look at your code, and attempt to duplicate some of the functionality shown in your video, but your video should be complete. If you are on a mac, please use the "HandBrake" app to remove some of the bloat (default configuration is fine). For example, my 629MB .mov turned into an 87MB .mp4 w/ no noticable degradation. Do NOT upload a .mov.

Video Walkthrough

walkthrough!

Bibliography

[1] Balakrishnan, Mahesh, et al. "Tango: Distributed data structures
    over a shared log." Proceedings of the twenty-fourth ACM symposium
    on operating systems principles. 2013.