From a3d5ae7014069d0cb07e2a80eb7e3eb43dde1ee0 Mon Sep 17 00:00:00 2001
From: "Peter J. Keleher" <keleher@cs.umd.edu>
Date: Thu, 9 Dec 2021 07:55:59 -0500
Subject: [PATCH] auto

---
 notes/hat.md  | 152 ++++++++++++++++++++++++++++++++++++++++++++++++++
 notes/hat.md~ | 109 ++++++++++++++++++++++++++++++++++++
 2 files changed, 261 insertions(+)
 create mode 100644 notes/hat.md
 create mode 100644 notes/hat.md~

diff --git a/notes/hat.md b/notes/hat.md
new file mode 100644
index 0000000..34d7413
--- /dev/null
+++ b/notes/hat.md
@@ -0,0 +1,152 @@
+# Access Anomalies
+
+
+Correctness anomalies
+- serializable isolation used to be considered perfect
+  - transactions could be run in parallel
+  - equiv to running one after another
+  - developer does not have to reason about concurrency
+- distributed and replicated DBs introduced new problems
+  - anomalies in serializable DBs arose that would not happen on a single machine
+  - 1SR “one-copy serializability”
+    - any read will return the most recent write to that data item, regardless of which replicas are read or written
+    - doesn’t fix all problems
+  -
+The Anomalies
+
+the immortal write -
+
+
+
+- tempting to leverage time travel to create an immortal blind write, which enables straightforward conflict resolution without violating the serializability guarantee
+
+the stale read
+
+
+
+- in single-server system little incentive to read older versions
+- in distributed system, most recent version costs latency
+
+the causal reverse
+
+
+
+- serialization order doesn’t respect potential causality of non-conflicting writes (to different data, for example)
+
+SumUp
+
+- all can occur in 1SR
+- none occur w/ strict serializability
+
+
+
+
+# Highly Available
+
+```
+    authors: Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica
+    title: "HAT, not CAP: Towards Highly Available Transactions"
+    where: HotOS 2013
+```
+
+### Definitions, in the context of Partitions:
+
+- **high availability:**
+  - each user that can contact a non-failing server eventually receives
+    response, even in presence of arb long partitions
+- **sticky availability:**
+  - whenever client accesses copy that reflects all prior operations,
+    eventually receives a response
+- **transactional replica availability:**
+  - if T can contact at least one replica for each item
+- **aborts:**
+  - internal
+  - external (due to system or operation impl)
+"**dirty reads**"
+- reading uncommitted
+```
+    T1: wx(1) wx(2) commit
+    T2: wx(3)
+    T3: rx(?)
+```
+Read should not ever return '1', and shouldn't return '3' if T2 aborts
+
+"**dirty writes**"
+A *dirty write* occurs when one transaction overwrites a value that has previously been
+written by another still in-flight transaction. Why bad? Could violate consistency
+guarantees. Assume invariant *x == y*:
+```
+    T1: wx(1) wy(1)
+    T2: wx(2) wy(2)
+```
+Both preserve consistency in isolation, but not w/ this schedule and dirty writes.
+
+
+## Isolation guarantees:
+
+"**read uncommitted**" (PL-1)
+- writes to each obj totally ordered (prohibits dirty writes)
+- writes *across* objects consistently ordered
+- implement w/ per-trans time, *last-writer-wins*
+
+"**read committed**" (PL-2)
+- no dirty writes, no dirty reads
+- implement w/ buffers (though doesn't guarantee recency)
+
+"**repeatable read**" (cut (*snapshot*) isolation)
+- item cut iso (multiple different values): buffer reads
+- predicate cut iso (cut over "SELECT ..WHERE....")
+- impl both w/ buffering
+
+----
+### Unachievable isolation levels
+- snapshot isolation
+  - read from consistent cut
+  - commit only if items from writeset not committed by another T since snapshot
+  - partition either delays commit or causes lost updates
+- cursor stability
+  - means DB holds lock on a row while accessing, and no other T can
+    access it during this time, repeatable read often means holding lock
+    on entire set of results
+  - violated by lost writes when locks do not reach across a partition.
+  - therefore not HAT (because can't prevent lost updates)
+----
+### Unachievable properties
+- preventing lost updates
+```
+    lost update (a==1) -
+    T1: Rx(100), Wx(100+20=120)
+    T2: Rx(100), Wx(100+30=130)
+```
+  Final value should be 150. Lost update would be 120 or 130.
+  W/ partition, T1 and T2 might not see each other, hence lost update.
+
+  *Clearly impossible to prevent in a highly available dist environment.*
+
+- preventing write skew. Write Skew generalizes LU to multiple keys. Possible problem is
+violation of consistency, such as "x == y"
+```
+    T1: t = x; y = t
+    T2: t = y; x = t
+```
+Can happen w/ snapshot isolation.
+
+- Serializability:
+  - optimistic requires global validation
+  - pessimistic requires global coord/locking
+
+
+
+- Katura not buying sticky avail (definitional)
+- Nao - "really confusing" (yes)
+- Patrick/Andrew - causal only w/ sticky (client caching breaks lots of guarantees)
+
+
+
+----
+
+### HAT-compliant:
+
+
+
+----
diff --git a/notes/hat.md~ b/notes/hat.md~
new file mode 100644
index 0000000..e7766ef
--- /dev/null
+++ b/notes/hat.md~
@@ -0,0 +1,109 @@
+# Highly Available
+
+```
+    authors: Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica
+    title: "HAT, not CAP: Towards Highly Available Transactions"
+    where: HotOS 2013
+```
+
+### Definitions, in the context of Partitions:
+
+- **high availability:**
+  - each user that can contact a non-failing server eventually receives
+    response, even in presence of arb long partitions
+- **sticky availability:**
+  - whenever client accesses copy that reflects all prior operations,
+    eventually receives a response
+- **transactional replica availability:**
+  - if T can contact at least one replica for each item
+- **aborts:**
+  - internal
+  - external (due to system or operation impl)
+"**dirty reads**"
+- reading uncommitted
+```
+    T1: wx(1) wx(2) commit
+    T2: wx(3)
+    T3: rx(?)
+```
+Read should not ever return '1', and shouldn't return '3' if T2 aborts
+
+"**dirty writes**"
+A *dirty write* occurs when one transaction overwrites a value that has previously been
+written by another still in-flight transaction. Why bad? Could violate consistency
+guarantees. Assume invariant *x == y*:
+```
+    T1: wx(1) wy(1)
+    T2: wx(2) wy(2)
+```
+Both preserve consistency in isolation, but not w/ this schedule and dirty writes.
+
+
+## Isolation guarantees:
+
+"**read uncommitted**" (PL-1)
+- writes to each obj totally ordered (prohibits dirty writes)
+- writes *across* objects consistently ordered
+- implement w/ per-trans time, *last-writer-wins*
+
+"**read committed**" (PL-2)
+- no dirty writes, no dirty reads
+- implement w/ buffers (though doesn't guarantee recency)
+
+"**repeatable read**" (cut (*snapshot*) isolation)
+- item cut iso (multiple different values): buffer reads
+- predicate cut iso (cut over "SELECT ..WHERE....")
+- impl both w/ buffering
+
+----
+### Unachievable isolation levels
+- snapshot isolation
+  - read from consistent cut
+  - commit only if items from writeset not committed by another T since snapshot
+  - partition either delays commit or causes lost updates
+- cursor stability
+  - means DB holds lock on a row while accessing, and no other T can
+    access it during this time, repeatable read often means holding lock
+    on entire set of results
+  - violated by lost writes when locks do not reach across a partition.
+  - therefore not HAT (because can't prevent lost updates)
+----
+### Unachievable properties
+- preventing lost updates
+```
+    lost update (a==1) -
+    T1: Rx(100), Wx(100+20=120)
+    T2: Rx(100), Wx(100+30=130)
+```
+  Final value should be 150. Lost update would be 120 or 130.
+  W/ partition, T1 and T2 might not see each other, hence lost update.
+
+  *Clearly impossible to prevent in a highly available dist environment.*
+
+- preventing write skew. Write Skew generalizes LU to multiple keys. Possible problem is
+violation of consistency, such as "x == y"
+```
+    T1: t = x; y = t
+    T2: t = y; x = t
+```
+Can happen w/ snapshot isolation.
+
+- Serializability:
+  - optimistic requires global validation
+  - pessimistic requires global coord/locking
+
+
+
+- Katura not buying sticky avail (definitional)
+- Nao - "really confusing" (yes)
+- Patrick/Andrew - causal only w/ sticky (client caching breaks lots of guarantees)
+
+
+
+----
+
+### HAT-compliant:
+
+
+
+----
-- 
GitLab
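
Note on the lost-update schedule in the notes (T1 and T2 both read x = 100, then write 120 and 130): the effect of a partition can be sketched in a few lines. This is only an illustration under stated assumptions, not code from the HAT paper. `Replica`, `increment`, and `merge_last_writer_wins` are hypothetical names, and reconciling the two sides by timestamp is an assumption borrowed from the notes' *last-writer-wins* remark.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical single-key replica holding x and the timestamp of its last write."""
    value: int
    timestamp: int = 0

    def read(self) -> int:
        return self.value

    def write(self, value: int, timestamp: int) -> None:
        self.value, self.timestamp = value, timestamp


def increment(replica: Replica, delta: int, timestamp: int) -> None:
    """One read-modify-write transaction: Rx(v), Wx(v + delta)."""
    current = replica.read()
    replica.write(current + delta, timestamp)


def merge_last_writer_wins(a: Replica, b: Replica) -> int:
    """Assumed reconciliation once the partition heals: the newest write wins."""
    winner = a if a.timestamp >= b.timestamp else b
    return winner.value


def run_serial() -> int:
    """T1 then T2 against one copy: both increments survive."""
    x = Replica(100)
    increment(x, 20, timestamp=1)  # T1
    increment(x, 30, timestamp=2)  # T2
    return x.value


def run_partitioned() -> int:
    """T1 and T2 on opposite sides of a partition: each reads the old 100."""
    left, right = Replica(100), Replica(100)
    increment(left, 20, timestamp=1)   # T1 writes 120 on its side
    increment(right, 30, timestamp=2)  # T2 writes 130 on its side
    return merge_last_writer_wins(left, right)


if __name__ == "__main__":
    print("serial:", run_serial())
    print("partitioned:", run_partitioned())
```

Whichever timestamp wins, the merged value is one of the two blind overwrites (120 or 130) and never 150, which is why the notes list lost-update prevention as unachievable while both sides stay available.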