Skip to content
Snippets Groups Projects
user avatar
keleher authored
d1de5a54
History

Project 5: Transactional Support From a Log

v1.0

Overall Project Description

You will be implementing a fully-replicated transactional key-value store. This implies:

  • reads and writes are only valid if in a committed transaction
  • clients parse all new log entries, applying committed transactions locally.

We will be using the log as a dumb consistent ledger. To this end we need to (1) create a way to commit arbitrary strings in the log, and (2) expose the shared log to clients.

We are going to handle both of these through a new method in the Client service: Cmd. Cmd takes identity, a string command, and a seen value that specifies the prefix of the shared log seen by the client previously.

This function:

  • commits the string onto the shared log,
  • synchronously waits until it has been committed, and then
  • returns everything in the log from slot until the slot containing the new command.

The client can use this to create local copy of the log that it will use to implement and interpret gets, puts, and transactions.

Cmd and associated type definitions have already been added to client.proto.

The Software Stack

The straightforward way to integrate transactions is to build them directly into the log implementation. Primarily, this allows dependencies between commits and transactional writes.

However, a shared log is a powerful abstraction that can be used to implement all sorts of functionality even in it's cleanest, most generic form. We are therefore going to layer the transactional functionality on top of the generic string interface described above.

Our simplified system is going to implement both a key-value store and transactions by layering them over the simple string interface supported by the Cmd interface described above. The client will construct a local representation of the log by retrieving and piecing together segments of the shared log.

Once the local log is created, the client can process operations, in order, to interpret transaction commands, determine whether they commit, and update a local copy of a key-value store that reflects committed transactions.

Unfortunately, Epaxos does not support a totally ordered log, instead it has a 2-D log and only supports a partial order over those entries. We are going to fix this in order to support the strict serializability used in Tango. We are going to address this through two mechanisms:

  • Create per-replica total orderings by adding something like inst.Deps[inst.Rep] = inst.Slot - 1 (aka "Chuck's Hack") before creating edges in the execution graph. This ensures that all instances of a single replica execute in order.
  • Send all client operations to replica 0. This is essentially dumbing down Epaxos to make it execute more like MultiPaxos.

Aside from Chuck's Hack, replica.go should only be modified to deal with the new Cmd request. This means ensure that the command will be sent through the BigChan and processed like gets and puts, and ensure that it will be executed. Execution of a Cmd means only that the log segment delimited is added to ClientReply.

The Log

The log will be a totally ordered list of strings, each in one of the following three formats:

        string = <client id>,<transaction id>,w,<key>,<value>
        string = <client id>,<transaction id>,r,<key>
        string = <client id>,<transaction id>,commit
        string = <client id>,<transaction id>,abort

A "client id" is something you make up in your input strings. For instance, you can be running two instances of transactionalClient.go at the same time (both talking to replica 0). The strings you type into each of the clients should have different client ids. The first two record formats define reads and writes in the log, while the last is a commit record.

Part 1: Serializable Transactions

Our transaction substrate is modeled on that in Tango. Update transactions are optimistic, and commit only if isolation is preserved for the entire transaction execution. Said another way:

   Definition 1:    Ti commits iff for each read r in Ti, the value read by r is not stale by the time Ti's commit record is encountered.

Transactions that are observed to commit have their writes reflected in the clients local key-value store.

Write-only Transactions

Write-only transactions are those with no reads. They always commit.

Read-only Transactions

Read-only transactions append only read commands to the shared log; there is no commit record. Read transactions commit using the criteria in Definition 1. The transaction's span in the log is defined as the log segment from the transaction's first read until it's last.

Reads from read-only transactions are committed to the log with transaction 0, which is ignored by other replicas.

Part 2: Snapshot Isolation

Your decision as to how to implement this, but it need not involve any changes to the log abstraction in replica.go

Use of the transactionalClient

transactionalClient.go differs from client.go in that it reads commands interactively. Once started, it reads commands (arbitrary strings) from STDIN and sends them to replica 0. The sole exception is:

  • pause <n> is interpreted as a command to pause for n seconds.

All other strings are sent to the replica. At end-of-text (CTRL-D), transactionalClient prints the contents of the K/V store and committed transactions.

Testing

Strict Serializability

      R1            R2
    1,1,r,A
    1,1,w,B,nice
                        2,1,r,A
                        2,1,w,C,foo
                        2,2,r,A
                        2,2,r,B
                        2,2,commit
                        2,1,commit

Your client should print:

	trans 2.2 abort
	trans 2.1 commit

Submitting

Submit by uploading tarball cd /vagrant/go/src/818f19/p5; tar cvfz DIRID.tgz DIRID, w/ your directory ID substituted in for "DIRID".

Upload here on ELMS.