diff --git a/notes/fuzzy.md b/notes/fuzzy.md index 54c7a037596f9128353a201998d98e6c08d07959..d0a4b4c92f08bdf028fcc00c5fd9339a38dd7399 100644 --- a/notes/fuzzy.md +++ b/notes/fuzzy.md @@ -9,87 +9,101 @@ - expensive - passing mention that CORFU doesn't scale past 20 servers in two adjacent racks - impossible (consistency w/ partitions) +- single sequencer works well for small clusters, doesn't scale w/ network diameter - +## Approach +- two causes of partial ordering + - data sharding + - geo-replication -**color:** updates to a single shard, made of multiple chains +## linear scaling from: +- storing each color on a different replica set -> appends to single color in one phase +- serializable isolation for multi-color appends +- failure atomicity +- lazy cross-region sync -**chains:** totally ordered set of updates from a single region. Chains connected by causal dependencies. +Captured in the fuzzy log: -Each server has: -- latest copy of local chains -- stale prefixes of other chains +<img src=fuzzyLog.png width=500> -**node append:** -- node added to local chain of specified color -- w/ outgoing cross-links to *last node seen by client of each remote chain* (same color) -- can be atomically appended to multiple (local?) colors, **transaction across shards** +## Abstraction +- **color:** updates to a single shard, made of multiple chains +- **chain:** totally ordered set of updates from a single region, single shard. Chains connected by causal dependencies. +- chains within a color connected by causal cross-links -"toggle between strong and weak consistency during partitions by switching regions" +Each region has: +- latest copy of local chains +- possibly stale prefixes of other chains -linear scaling from: -- storing each color on a different replica set -> appends to single color in one phase -- serializable isolation for multi-color appends -- failure atomicity -- lazy cross-region sync +Application approach: +- apps partition data across shards +- weaken consistency by not using synchronous coordination on critical paths + +**Client API:** +<img src=fuzzyAPI.png width=500> - - `new_instance`: adding color to local -- `sync`: plays all local nodes not seen by app's previous sync, ordered by chain and cross-edges (causal) -- `append`: added to end of local totally-ordered chain, cross-edges to tail of each other chain (edges from a node define a snapshot) -- `trim`: GC nodes prior to snapshot +- `sync`: plays all local nodes not seen by client's previous sync, ordered by chain and cross-edges (causal) +- `append`: added to end of local totally-ordered chain, cross-edges to tail of other local chains +- `trim`: GC nodes prior to a snapshot - +**deets:** +- node added to local chain of specified color +- w/ outgoing cross-links to *last node seen by client of each remote chain* (same color) +- can be atomically appended to multiple (local?) colors (**transaction across shards**) ----- -## Implementation -- each color a single partition -- each partition replicated using *chain-replication* -- battery backed-up ram for servers -- local append to single color one phase (send it to the chain replication) -- local append to multiple colors requires two phases (Skeen's) - - logical clocks - - *phase 1:* send out multicast w/ proposed timestamp - - replies return local timestamps - - *phase 2:* send msg again w/ max of local or returned timestamps - - order at all destinations with new timestamp - ---- +<img src=fuzzyEvolution.png width=500> -*Any protocol that provides a total order consistent with a - linearizable order (i.e, if an update B starts in real time after - another update A completes, then B occurs after A in the total order) - is subject to unavailability during network partitions.* +## Semantics +- two `append` operations same color from different regions only + ordered if one already seen by client issuing the other +- never any conflicts when updates to same color by different regions merge + - only partial orders + - not clear how (if) we get eventual consistency +- operations in a single region are serializable + - linearizable iff operations to a single color (chain) + - not externalizable if multiple colors -Is this true? +"toggle between strong and weak consistency during partitions by switching regions" ??? + +---- + ## Apps -- LogMap +<img src=fuzzyMaps.png width=600> + +- LogMap (corfu) - single region, single color - `get` waits until a sync that started after sync issue, then accesses local state - `put` appends to log, then waits for `sync` to show it applied before returning + - poor scalability, performance, availability - ShardedMap - single region, *multiple* color - - says linear scaling with linearizable put/get ops + - linear scaling with linearizable put/get ops - why? scaling because can scale colors with num servers - AtomicLog - - multiple colors, which are serializable (because underlying ops are serializable) + - atomic multi-puts (no gets) + - multiple colors, which are serializable (because underlying ops + are serializable), not linearizability - TxMap - wants *strict serializability* (externalizable) - speculative intention *commit* appends - servers playing the intention node append *yes/no* for a specific color based on - presence of conflicting ops im log between the original appends and the speculative commits - - other servers only apply transaction if they see a *yes* for each color + presence of conflicting ops im log between the original appends + and the speculative commits (tango) + - other servers only apply transaction if they see a *yes* for every used color - CRDTMap - - CRDT: *conflict-free-replicated-data-type* + - CRDT: *convergent-free-replicated-data-type* + - *causal* consistency - single color, async replication of remote chains - - **convergence** through + - **convergence** through *observed-remove* sets + - not immediately clear how a CRDT set applies to convergence here - CAPMap - consistency - - strong when to partition + - strong when no partition - causal otherwise - routes put through proxy in a *primary* region, if avail. - RedBlueMap: @@ -97,21 +111,29 @@ Is this true? - blue ops not -## Multiple-color append -- to local chains -- operations within a single region are serializable -- appends in some serial order - - linearizable if to a single color - - not necessarily externalizable if to multiple colors - -Difference between LogMap and ShardedMap? - +## Implementation +- each color a single partition +- each partition replicated using *chain-replication* +- battery backed-up ram for servers +- local append to single color one phase (send it to the chain replication) +- local append to multiple colors requires two phases (Skeen's) + - logical clocks + - *phase 1:* send out multicast w/ proposed timestamp + - replies return local timestamps + - *phase 2:* send msg again w/ max of local or returned timestamps + - order at all destinations with new timestamp + - very reminiscent of paxos w/ prepare and accept phases +<img src=fuzzySkeen.png width=500> +--- -Regions vs machines? +*Any protocol that provides a total order consistent with a + linearizable order (i.e, if an update B starts in real time after + another update A completes, then B occurs after A in the total order) + is subject to unavailability during network partitions.* -Guarantees of durability? +### Durability? - when is an update "committed"? ## Comments, Questions -- (patrick) colors maybe difficult to use + diff --git a/notes/fuzzyAPI.png b/notes/fuzzyAPI.png index 3ab2c9ad63be61b59d6d3701519c60c5b95879b6..99ad17cc7fa0b2c8fb157ab835cdc5935b14dbb0 100644 Binary files a/notes/fuzzyAPI.png and b/notes/fuzzyAPI.png differ diff --git a/notes/fuzzyEvolution.png b/notes/fuzzyEvolution.png index d29fa28c438de0da4e39c10e0b40f977d74f925e..d8a9ea3582c9b59c5d3df826f13eb41970d314c7 100644 Binary files a/notes/fuzzyEvolution.png and b/notes/fuzzyEvolution.png differ diff --git a/notes/fuzzyLog.png b/notes/fuzzyLog.png index 79a07d240f10181b1d0394364c7bb8c95626d923..3a7d11442164b0dc653b9408fcb8b2284b577c31 100644 Binary files a/notes/fuzzyLog.png and b/notes/fuzzyLog.png differ diff --git a/notes/fuzzyMaps.png b/notes/fuzzyMaps.png new file mode 100644 index 0000000000000000000000000000000000000000..367d3ba18df4fa76753cf558b15024a6b5cba73e Binary files /dev/null and b/notes/fuzzyMaps.png differ diff --git a/notes/fuzzySkeen.png b/notes/fuzzySkeen.png index 2db2e0eb5c986a91a403814ef9b3e797fe41e7fd..f6fa1da2355be4184ee6735c6721e61374f1db0b 100644 Binary files a/notes/fuzzySkeen.png and b/notes/fuzzySkeen.png differ