Commit 8fa76d44 authored by Peter J. Keleher

auto

parent d8e0680c
notes/causalReverse.png

118 KiB

# Highly Available
```
authors: Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica
title: "HAT, not CAP: Towards Highly Available Transactions"
where: HotOS 2013
```
### Definitions, in the context of Partitions:
- **high availability:**
- each user that can contact a non-failing server eventually receives a
response, even in the presence of arbitrarily long partitions
- **sticky availability:**
- whenever a client accesses a copy that reflects all of its prior operations,
it eventually receives a response
- **transactional replica availability:**
- holds for T if T can contact at least one replica for each item it accesses
- **aborts:**
- internal
- external (due to system or operation impl)
"**dirty reads**"
- reading uncommitted
```
T1: wx(1) wx(2) commit
T2: wx(3)
T3: rx(?)
```
Read should not ever return '1', and shouldn't return '3' if T2 aborts
"**dirty writes**"
A *dirty write* occurs when one transaction overwrites a value that has previously been
written by another still in-flight transaction. Why bad? Could violate consistency
guarantees. Assume invariant *x == y*:
```
T1: wx(1) wy(1)
T2: wx(2) wy(2)
```
Both preserve consistency in isolation, but not w/ this schedule and dirty writes.
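A minimal sketch of the schedule above, assuming a plain dict as the store and no concurrency control at all. The helper `run_schedule` and the tuple layout are made up for illustration.

```python
def run_schedule(schedule):
    """Apply (txn, key, value) writes in the given order; return the final store."""
    store = {}
    for _txn, key, value in schedule:
        store[key] = value
    return store

# Serial order: T1 runs to completion, then T2.
serial = [("T1", "x", 1), ("T1", "y", 1), ("T2", "x", 2), ("T2", "y", 2)]
# Interleaving with dirty writes: T2 overwrites x while T1 is still in flight,
# then T1 overwrites T2's y.
interleaved = [("T1", "x", 1), ("T2", "x", 2), ("T2", "y", 2), ("T1", "y", 1)]

serial_result = run_schedule(serial)
dirty_result = run_schedule(interleaved)

print(serial_result["x"] == serial_result["y"])  # True: invariant x == y holds
print(dirty_result["x"] == dirty_result["y"])    # False: x is 2 but y is 1
```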
## Isolation guarantees:
"**read uncommitted**" (PL-1)
- writes to each obj totally ordered (prohibits dirty writes)
- writes *across* objects consistently ordered
- implement w/ per-transaction timestamps, *last-writer-wins*
"**read committed**" (PL-2)
- no dirty writes or dirty reads
- implement by buffering writes until commit (though doesn't guarantee recency)
"**repeatable read**" (cut (*snapshot*) isolation)
- item cut iso (repeated reads of an item return the same value): buffer reads
- predicate cut iso (cut over "SELECT ..WHERE....")
- impl both w/ buffering
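The two implementation hints above (last-writer-wins on a per-transaction timestamp, and buffering writes until commit) can be sketched together. This is an illustrative single-replica toy, not the paper's implementation. `Replica` and `Txn` are hypothetical names, and timestamps are assumed to be unique integers.

```python
class Replica:
    """Holds the committed (timestamp, value) pair for each key."""

    def __init__(self):
        self.data = {}

    def apply(self, ts, writes):
        # Last-writer-wins: a write is kept only if its timestamp is newer.
        for key, value in writes.items():
            if key not in self.data or self.data[key][0] < ts:
                self.data[key] = (ts, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


class Txn:
    """Buffers writes client-side so nothing uncommitted is ever visible."""

    def __init__(self, replica, ts):
        self.replica = replica
        self.ts = ts
        self.buffer = {}

    def write(self, key, value):
        self.buffer[key] = value  # intermediate values are overwritten in the buffer

    def read(self, key):
        if key in self.buffer:
            return self.buffer[key]
        return self.replica.read(key)

    def commit(self):
        self.replica.apply(self.ts, self.buffer)

    def abort(self):
        self.buffer.clear()


# Mirrors the dirty-read schedule: T1: wx(1) wx(2) commit, T2: wx(3), T3: rx(?)
replica = Replica()
t1, t2, t3 = Txn(replica, 1), Txn(replica, 2), Txn(replica, 3)

t1.write("x", 1)
seen_before = t3.read("x")   # T1 has not committed, so its write is invisible
t1.write("x", 2)
t1.commit()
t2.write("x", 3)
seen_after = t3.read("x")    # sees T1's final value, never the intermediate 1
t2.abort()
seen_final = t3.read("x")    # T2's 3 was never exposed

replica.apply(0, {"x": 99})  # an older write arriving late loses under LWW
```

Buffering gives no recency guarantee: `t3` may legitimately keep reading an old committed value on a replica that has not yet received newer commits.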
----
### Unachievable isolation levels
- snapshot isolation,
- read from consistent cut
- commit only if items from writeset not committed by another T since snapshot
- under partition, must either delay (lose availability) or suffer lost updates
- cursor stability
- means DB holds a lock on a row while a cursor is accessing it, so no other T can
access it during this time; repeatable read often means holding locks
on the entire set of results
- violated if writes are lost because locks do not reach across a partition.
- therefore not HAT (because can't prevent lost updates)
----
### Unachievable properties
- preventing lost updates
```
lost update (x == 100 initially):
T1: Rx(100), Wx(100+20=120)
T2: Rx(100), Wx(100+30=130)
```
Final value should be 150. A lost update leaves 120 or 130.
W/ partition, T1 and T2 might not see each other, hence lost update.
*Cannot be prevented while remaining available during a partition.*
- preventing write skew. Write skew generalizes lost update to multiple keys. Possible problem is
violation of consistency, such as "x == y"
```
T1: t = x; y = t
T2: t = y; x = t
```
Can happen w/ snapshot isolation.
- Serializability:
- optimistic requires global validation
- pessimistic requires global coord/locking
- Katura not buying sticky avail (definitional)
- Nao - "really confusing" (yes)
- Patrick/Andrew - causal only w/ sticky (client caching breaks lots of guarantees)
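The lost-update and write-skew schedules above can be replayed in a few lines. This is a sketch under simplifying assumptions: each side of a partition is a plain dict, and a "snapshot" is just a copy of the database taken before either transaction runs. All function names are hypothetical.

```python
# Lost update: both sides of the partition start from x == 100.
side_a = {"x": 100}
side_b = {"x": 100}
side_a["x"] = side_a["x"] + 20   # T1: Rx(100), Wx(120)
side_b["x"] = side_b["x"] + 30   # T2: Rx(100), Wx(130)

# After the partition heals, any pick-one merge keeps 120 or 130.
merge_candidates = {side_a["x"], side_b["x"]}
intended = 100 + 20 + 30


# Write skew: T1 copies x into y, T2 copies y into x.
def t1(view):
    return {"y": view["x"]}

def t2(view):
    return {"x": view["y"]}

def run_serial(db, txns):
    """Each transaction sees the writes of the ones before it."""
    db = dict(db)
    for txn in txns:
        db.update(txn(db))
    return db

def run_on_snapshot(db, txns):
    """Every transaction reads the same snapshot; write sets are disjoint,
    so a first-committer-wins check on write sets finds no conflict."""
    snapshot = dict(db)
    result = dict(db)
    for txn in txns:
        result.update(txn(snapshot))
    return result

start = {"x": 1, "y": 2}
serial_12 = run_serial(start, [t1, t2])
serial_21 = run_serial(start, [t2, t1])
skewed = run_on_snapshot(start, [t1, t2])

print(intended in merge_candidates)   # False: the 150 outcome is lost
print(serial_12, serial_21, skewed)
```

Either serial order ends with `x == y`, while the snapshot execution simply swaps the two values.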
----
### HAT-compliant:
![pic](hat.png)
----
# SALT
"offering atomicity and isolation at the same granularity is the very reason why ACID
transactions are ill-equipped .....(performance vs programmability)"
**Pareto Principle:** 80% of effects from 20% of causes
- splitting ACID transactions up very good for concurrency
- bad for isolation
- key issue is to provide isolation at a finer granularity, same atomicity
- nested transactions give subtrans' atomicity, isolate entire thing
*Allow isolation to be specified at smaller granularity than atomicity*
## Background
ANSI iso levels:
- read-uncommitted
- read-committed (assumed goal for paper)
- repeatable read
- serializable
Issues:
- dirty writes: overwrite uncommitted
- dirty reads: read from uncommitted
- non-repeatable reads
- phantom
## Big Things
- BASE transactions
- salt isolation
- clever names
## BASE transactions
- **alkaline** subtransactions
- no other transactions can see state of *uncommitted alkaline subtrans*
- **committed** alk subtrans state viewable by other BASE or alkaline transactions
- *not visible* to ACID until entire BASE commit.
- intended for *partition-local* ops
- **salt isolation** allows control of internal states visibility (among other BASE transactions)
- each alkaline sub has associated *exception*
- *BASE transactions look like ACID transactions to other ACID transactions*
- **accepted** once any alkaline trans commits,
- accepted implies commit of entire BASE
- i.e. all operations successfully executed *or bypassed because of some exception*
- aborted only if they encounter an error before the transaction is accepted (unlike ACID)
## Salt Isolation
*"If two operations in different transactions conflict, then the temporal dependency
that exists between the earlier and the later of these operations must extend to the
entire transaction"* (allows SALT to work w/ different isolation levels)
*Isolation:* Let Q be the set of operation types {*read*, *range-read*, *write*} and let L and S
be subsets of Q . Further, let *o<sub>1</sub>* in *txn<sub>1</sub>* and *o<sub>2</sub>* in *txn<sub>2</sub>*, be two operations, respectively
of type *T<sub>1</sub>* ∈ L and *T<sub>2</sub>* ∈ S , that access the same object in a conflicting
(i.e. non read-read) manner. **If *o<sub>1</sub>* completes before *o<sub>2</sub>* starts, then *txn<sub>1</sub>* must decide
before *o<sub>2</sub>* starts.**
<img src=saltSets.png width=500>
The Isolation property holds as long as (a) at least one of txn<sub>1</sub> and
txn<sub>2</sub> is an ACID transaction or (b) both txn<sub>1</sub> and txn<sub>2</sub> are alkaline
subtransactions.
So:
- ACID transactions isolated from all other
- Alkaline subtrans isolated from ACID and other alkaline
- BASE expose states at alkaline boundaries to other BASE
- Design
- locks (because high contention)
- type
- ACID - conflict with alkaline and saline
- alkaline - conflict w/ ACID and other alkaline
- saline: conflict with ACID locks (except for read/read) only
- lock duration
- *long term* (life of (sub-)trans, 2PL)
- *short-term* (just the op)
- acquire only an alkaline lock at operation start
- “downgrade” it to saline at end of subtransaction, hold until after the end of the BASE transaction.
- no multi-version concurrency
![sets](saltConcurrent.png)
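The lock-type rules in the design list can be written as a small conflict check. This is a sketch built only from the bullets above. The ACID/ACID entry is an added assumption (ordinary two-phase locking between ACID transactions), and the names `Lock`, `CONFLICTING_TYPES`, and `conflicts` are hypothetical.

```python
from enum import Enum


class Lock(Enum):
    ACID = "acid"
    ALKALINE = "alkaline"
    SALINE = "saline"


# Unordered pairs of lock types that conflict (unless both accesses are reads).
CONFLICTING_TYPES = {
    frozenset({Lock.ACID}),                  # ACID vs ACID (assumed, as in plain 2PL)
    frozenset({Lock.ACID, Lock.ALKALINE}),
    frozenset({Lock.ACID, Lock.SALINE}),
    frozenset({Lock.ALKALINE}),              # alkaline vs other alkaline
}


def conflicts(held, held_mode, requested, requested_mode):
    """Return True if a requested lock must wait for a held lock on the same object."""
    if held_mode == "read" and requested_mode == "read":
        return False  # read/read never conflicts
    return frozenset({held, requested}) in CONFLICTING_TYPES


# A saline lock left behind by a committed alkaline subtransaction blocks ACID
# transactions but not other BASE transactions.
print(conflicts(Lock.SALINE, "write", Lock.ACID, "read"))       # True
print(conflicts(Lock.SALINE, "write", Lock.ALKALINE, "write"))  # False
```

Because saline and alkaline locks never conflict with each other, state written by a committed alkaline subtransaction is reachable by other BASE transactions while staying hidden from ACID transactions until the whole BASE transaction ends.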
## Indirect Dirty Reads
<p>
![fig4](saltFig4.png)
<p>
Fixed by:
- **Read-after-write across transactions** A BASE transaction *B<sub>r</sub>* that reads a value x, which has been written by another BASE transaction *B<sub>w</sub>*, cannot release its saline lock on x until *B<sub>w</sub>* has released its own saline lock on x.
- **Write-after-read within a transaction** An operation *O<sub>w</sub>* that writes a value x cannot
release its saline lock on x until all previous read operations within the **same** BASE
transaction have released their saline locks on their respective objects.
These two ensure uncommitted writes keep locks until all prior read locks also released.
<p>
## Forward Logging
- after BASE knows it will commit (because an alkaline subtransaction committed), log entire BASE
- prevents cascades that would have occurred because of saline visibility
### Banking, again
![fig1](saltAcidApp.png)
![banking](saltBanking.png)
## Performance
![saltPerf1](saltPerf1.png)
![saltPerf2](saltPerf2.png)
## Comments
//=====================================================================
# Correctness Anomalies Under Serializable Isolation
Background: serializability guarantees an execution (a schedule)
equivalent to a serial schedule, but this serial schedule does not
have to be the ordering that actually occurred in real time (wall
clock time).
Background: *one-copy serializability* (1SR):
```
...either ... will be ordered first in the equivalent serial order. Whichever transaction is second --- when it reads the balance --- it must read the value written by the first transaction
```
### The Immortal write
<img src=immortalWrite.png width=500>
### The Stale Read
<img src=staleRead.png width=500>
### The Causal Reverse
<img src=causalReverse.png width=500>
*Strict serializability* prevents all of these.
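The extra constraint that strict serializability adds is that the equivalent serial order must respect real time. A minimal sketch of that check, assuming each transaction is summarized by a (start, end) wall-clock interval; `respects_real_time` and the sample intervals are made up for illustration.

```python
def respects_real_time(serial_order, intervals):
    """True if, whenever A ended before B started, A precedes B in serial_order."""
    position = {name: i for i, name in enumerate(serial_order)}
    for a, (_a_start, a_end) in intervals.items():
        for b, (b_start, _b_end) in intervals.items():
            if a != b and a_end < b_start and position[a] > position[b]:
                return False
    return True


# Stale read: T1 writes and finishes at time 2; T2 starts at time 3 but
# still reads the old value, so the only equivalent serial order is T2, T1.
intervals = {"T1": (1, 2), "T2": (3, 4)}

stale_order_ok = respects_real_time(["T2", "T1"], intervals)
fresh_order_ok = respects_real_time(["T1", "T2"], intervals)
print(stale_order_ok, fresh_order_ok)  # False True
```

Plain serializability accepts the order `T2, T1` because some serial order exists. Strict serializability rejects it because T1 finished before T2 began.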
notes/immortalWrite.png

96.2 KiB

# SALT
"offering atomicity and isolation at the same granularity is the very reason why ACID
transactions are ill-equipped .....(performance vs programmability)"
**Pareto Principle:** 80% of effects from 20% of causes
- splitting ACID transactions up very good for concurrency
- bad for isolation
- key issue is to provide isolation at a finer granularity, same atomicity
- nested transactions give subtrans' atomicity, isolate entire thing
## Background
ANSI iso levels:
- read-uncommitted
- read-committed (assumed goal for paper)
- repeatable read
- serializable
Issues:
- dirty writes: overwrite uncommitted
- dirty reads: read from uncommitted
- non-repeatable reads
- phantom
## Big Things
- BASE transactions
- salt isolation
- clever names
## BASE transactions
- **alkaline** subtransactions
- no other transactions can see state of *uncommitted alkaline subtrans*
- **committed** alk subtrans state viewable by other BASE or alkaline transactions
- *not visible* (to ACID?) until entire BASE commit.
- **salt isolation** allows control of internal states visibility (among other BASE transactions)
- each alkaline sub has associated *exception*
- *BASE transactions look like ACID transactions to other ACID transactions*
- **accepted** once any alkaline trans commits,
- accepted implies commit of entire BASE
- i.e. all operations successfully executed *or bypassed because of some exception*
- aborted only if they encounter an error before the transaction is accepted (unlike ACID)
## Isolation
*"If two operations in different transactions conflict, then the temporal dependency
that exists between the earlier and the later of these operations must extend to the
entire transaction"* (allows SALT to work w/ different isolation levels)
*Isolation:* Let Q be the set of operation types {*read*, *range-read*, *write*} and let L and S
be subsets of Q . Further, let $o_1$ in $txn_1$ and $o_2$ in $txn_2$, be two operations, respectively
of type $T_1$ ∈ L and $T_2$ ∈ S , that access the same object in a conflicting
(i.e. non read-read) manner. **If $o_1$ completes before $o_2$ starts, then $txn_1$ must decide
before $o_2$ starts.**
Isolation property holds if at least one is ACID, or both are alkaline.
![sets](saltSets.png)
## Salt Isolation
**Salt Isolation**:
The Isolation property holds as long as (a) at least one of txn<sub>1</sub> and
txn<sub>2</sub> is an ACID transaction or (b) both txn<sub>1</sub> and txn<sub>2</sub> are alkaline
subtransactions.
So:
- ACID transactions isolated from all other
- Alkaline subtrans isolated from ACID and other alkaline
- BASE expose states at alkaline boundaries to other BASE
- Design
- locks (because high contention)
- type
- ACID - conflict with alkaline and saline
- alkaline - conflict w/ ACID and other alkaline
- saline: conflict with ACID locks (except for read/read) only
- lock duration
- *long term* (life of trans, 2PL)
- *short-term* (just the op)
- acquire only an alkaline lock at operation start
- “downgrade” it to saline at end of subtransaction, hold until after the end of the BASE transaction.
- no multi-version concurrency
![sets](saltConcurrent.png)
## Indirect Dirty Reads
<p>
![fig4](saltFig4.png)
<p>
Fixed by:
- **Read-after-write across transactions** A BASE transaction B<sub>r</sub> that reads a value x, which has been written by another BASE transaction B<sub>w</sub>, cannot release its saline lock on x until B<sub>w</sub> has released its own saline lock on x.
- **Write-after-read within a transaction** An operation O<sub>w</sub> that writes a value x cannot
release its saline lock on x until all previous read operations within the **same** BASE
transaction have released their saline locks on their respective objects.
These two ensure uncommitted writes keep locks until all prior read locks also released.
<p>
## Forward Logging
- after BASE knows it will commit (because an alkaline subtransaction committed), log entire BASE
- prevents cascades that would have occurred because of saline visibility
### Banking, again
![fig1](saltAcidApp.png)
![banking](saltBanking.png)
## Performance
![saltPerf1](saltPerf1.png)
![saltPerf2](saltPerf2.png)
## Comments
notes/staleRead.png

153 KiB
