# Access Anomalies


Correctness anomalies
- serializable isolation used to be considered perfect
    - transactions could be run in parallel
    - equiv to running one after another
    - developer does not have to reason about concurrency
- distributed and replicated DBs introduced new problems
    - anomalies in serializable DBs arose that would not happen on a single machine
    - 1SR “one-copy serializability”
        - any read will return the most recent write to that data item, regardless of which replicas are read or written
        - doesn’t fix all problems
    
## The Anomalies

*the immortal write* -

![](https://paper-attachments.dropbox.com/s_9C40D1E627431C359DFC050423F865DB110CEF11D9FA101F15E75C20CDD3C4D3_1639005186338_image.png)

- tempting for the system to leverage "time travel" (ordering a later blind write before an earlier one) to create an immortal blind write, which enables straightforward conflict resolution without violating the serializability guarantee
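
A minimal sketch of the idea, assuming a hypothetical system that assigns each blind write a position (timestamp) in the serial order and lets the highest timestamp win. The function name and the timestamps are illustrative only.

```python
# Hypothetical model: the system picks a serialization timestamp for each
# blind write; the write with the highest timestamp determines the final value.
def final_value(writes):
    """writes: (serialization_timestamp, value) pairs in real-time arrival order."""
    winner = max(writes, key=lambda w: w[0])
    return winner[1]

# "v1" commits first in real time; "v2" is submitted afterwards, but the
# system "time travels" it to an earlier position in the serial order.
writes_in_arrival_order = [(10, "v1"), (5, "v2")]
result = final_value(writes_in_arrival_order)
print(result)  # the earlier write survives
```

The serial order "v2, then v1" is a valid serializable history, yet the write that arrived later in real time never takes effect.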

*the stale read* -

![](https://paper-attachments.dropbox.com/s_9C40D1E627431C359DFC050423F865DB110CEF11D9FA101F15E75C20CDD3C4D3_1639005248966_image.png)

- in a single-server system there is little incentive to read older versions
- in a distributed system, reading the most recent version costs latency
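
A minimal sketch of how a stale read can arise, assuming a hypothetical primary with one asynchronously updated follower. All names here are illustrative.

```python
# Hypothetical model: one primary plus one asynchronously updated follower.
primary = {}
follower = {}
pending = []  # writes accepted by the primary but not yet shipped

def write(key, value):
    primary[key] = value
    pending.append((key, value))

def replicate():
    """Ship all pending writes to the follower, in order."""
    while pending:
        key, value = pending.pop(0)
        follower[key] = value

def read_nearby(key):
    """Low-latency read served by the follower, possibly stale."""
    return follower.get(key)

write("balance", 100)
replicate()
write("balance", 50)             # committed on the primary only
stale = read_nearby("balance")   # follower has not caught up yet
replicate()
fresh = read_nearby("balance")
```

The stale read is still consistent with a serial order in which the reader ran before the second write.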

*the causal reverse* -

![](https://paper-attachments.dropbox.com/s_9C40D1E627431C359DFC050423F865DB110CEF11D9FA101F15E75C20CDD3C4D3_1639023882254_image.png)

- serialization order doesn’t respect potential causality of non-conflicting writes (to different data, for example)
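
A rough sketch of one way this can surface, assuming (hypothetically) that each partition stamps commits with its own local clock and that the serial order follows those timestamps. The clock values are made up for illustration.

```python
# Hypothetical model: versions are (commit_timestamp, value) pairs, and a
# snapshot read at time ts sees the newest version committed at or before ts.
def snapshot_read(versions, ts):
    visible = [v for v in versions if v[0] <= ts]
    return max(visible, key=lambda v: v[0])[1]

# T1 writes x on partition A, whose clock reads 100, and completes.
x_versions = [(0, "x0"), (100, "x1")]
# T2 starts after T1 finished and writes y on partition B, whose clock lags (90).
y_versions = [(0, "y0"), (90, "y1")]

# A reader with a snapshot at 95 sees T2's write but not T1's.
seen = (snapshot_read(x_versions, 95), snapshot_read(y_versions, 95))
```

The reader observes the later write (`y1`) without the earlier one (`x1`), even though the two writes never conflict.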

### SumUp

- all can occur in 1SR
- none occur w/ strict serializability




# Highly Available
```
authors: Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica
title: "HAT, not CAP: Towards Highly Available Transactions"
where: HotOS 2013
```

### Definitions

- **high availability:**
  - each user that can contact a non-failing server eventually receives a
    response, even in the presence of arbitrarily long partitions
- **sticky availability:**
  - whenever a client accesses a copy that reflects all of its prior operations,
    it eventually receives a response
- **transactional replica availability:**
  - holds if T can contact at least one replica for each item it accesses
- **aborts:**
  - internal
  - external (due to system or operation impl)
- "**dirty reads**"
  - reading uncommitted data
```
     T1: wx(1) wx(2) commit
     T2: wx(3)
     T3: rx(?)
```
T3's read should never return '1' (an intermediate value written by T1), and should not return '3' if T2 aborts.

- "**dirty writes**"
A *dirty write* occurs when one transaction overwrites a value that has previously been
written by another still in-flight transaction. Why bad? Could violate consistency
guarantees. Assume invariant *x == y*:
```
     T1: wx(1)            wy(1)
     T2:      wx(2) wy(2)
```
Each transaction preserves the invariant in isolation, but this schedule with dirty writes leaves x == 2 and y == 1.
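
The schedule above can be replayed mechanically. This is a small sketch that assumes a plain dictionary as the store and applies each write in schedule order, with no protection against dirty writes.

```python
# Replay the interleaved schedule: (transaction, key, value) in execution order.
store = {"x": 0, "y": 0}
schedule = [
    ("T1", "x", 1),
    ("T2", "x", 2),
    ("T2", "y", 2),
    ("T1", "y", 1),
]
for _txn, key, value in schedule:
    store[key] = value  # each write overwrites whatever is there

invariant_holds = store["x"] == store["y"]
print(store, invariant_holds)
```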


## Isolation guarantee definitions:

"**read uncommitted**" (PL-1)
- writes to each obj totally ordered (prohibits dirty writes)
- writes *across* objects consistently ordered
- implement w/ a per-transaction timestamp and *last-writer-wins*
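
A minimal sketch of last-writer-wins with one timestamp per transaction, assuming a hypothetical replica that stores `(txn_timestamp, value)` per key. Function and variable names are illustrative.

```python
# Hypothetical replica: each key maps to (txn_timestamp, value).
def apply_write(replica, key, txn_ts, value):
    """Keep the write only if its transaction timestamp is the highest seen."""
    current = replica.get(key)
    if current is None or txn_ts > current[0]:
        replica[key] = (txn_ts, value)

replica = {}
# Writes from T1 (timestamp 1) and T2 (timestamp 2) arrive interleaved:
# (key, txn_timestamp, value)
arrivals = [("x", 1, 1), ("x", 2, 2), ("y", 2, 2), ("y", 1, 1)]
for key, txn_ts, value in arrivals:
    apply_write(replica, key, txn_ts, value)

final = {key: value for key, (_ts, value) in replica.items()}
```

Because every write of a transaction carries the same timestamp, the same transaction wins on every key, so writes across objects end up consistently ordered regardless of arrival order.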

"**read committed**" (PL-2)
- no dirty writes or dirty reads
- implement by buffering writes until commit (though this doesn't guarantee recency)
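
A minimal sketch of the buffering approach, assuming a hypothetical single-process store where a transaction keeps its writes private until commit. The `Txn` class is illustrative, not an API from the paper.

```python
committed = {"x": 0}  # hypothetical shared store holding only committed values

class Txn:
    def __init__(self):
        self.buffer = {}  # uncommitted writes stay private to the transaction

    def write(self, key, value):
        self.buffer[key] = value

    def read(self, key):
        # A transaction sees its own writes, otherwise committed state.
        return self.buffer.get(key, committed.get(key))

    def commit(self):
        committed.update(self.buffer)
        self.buffer = {}

    def abort(self):
        self.buffer = {}

t1, t2, t3 = Txn(), Txn(), Txn()
t1.write("x", 1)
t1.write("x", 2)
before = t3.read("x")   # T1 still in flight: neither 1 nor 2 is visible
t1.commit()
t2.write("x", 3)
during = t3.read("x")   # T2 still in flight: 3 is not visible
t2.abort()
after = t3.read("x")
```

Readers only ever see committed values, but nothing forces them to see the most recent one.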

"**repeatable read**"
- **cut** (*snapshot*) **isolation**
  - single piece of data
  - this is the usual meaning of RR
  - implement by buffering read values
- **item cut isolation** 
  - multiple different values
  - implement by buffering reads
- **predicate cut isolation**
  - cut over "SELECT ... WHERE ..." (phantom anomalies)
  - implement by caching entire logical ranges
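
A minimal sketch of item cut isolation via read buffering, assuming a hypothetical transaction object that caches the first value it reads for each key. Names are illustrative.

```python
store = {"x": 1}  # hypothetical shared store, mutated by other transactions

class CutTxn:
    def __init__(self):
        self.read_cache = {}

    def read(self, key):
        # First read fetches from the store; later reads reuse the cached value.
        if key not in self.read_cache:
            self.read_cache[key] = store.get(key)
        return self.read_cache[key]

txn = CutTxn()
first = txn.read("x")
store["x"] = 2              # a concurrent transaction commits a new value
second = txn.read("x")      # same value as the first read
fresh_txn_value = CutTxn().read("x")
```

The predicate variant would need the same treatment for whole logical ranges rather than single keys.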

----
### Unachievable isolation levels with partitions:
- snapshot isolation
  - read from consistent cut
  - commit only if no item in the writeset has been committed by another T since the snapshot
  - under a partition, either commits are delayed or lost updates occur
- cursor stability
  - means the DB holds a lock on a row while accessing it, and no other T can
    access the row during this time; repeatable read often means holding locks
    on the entire set of results
  - violated by lost writes, because locks do not reach across a partition
  - therefore not HAT (because it can't prevent lost updates)
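
A rough sketch of the snapshot isolation commit check, assuming a hypothetical store that keeps `(commit_ts, value)` per key. The check must see the latest committed version of every key in the writeset, which is the coordination that a partition makes unavailable.

```python
# Hypothetical store: key -> (commit_ts of the latest version, value).
store = {"x": (0, 100)}

def try_commit(snapshot_ts, write_set):
    """Commit only if no key in the write set changed since the snapshot."""
    for key in write_set:
        if store[key][0] > snapshot_ts:
            return False  # another transaction committed this key first
    new_ts = max(ts for ts, _value in store.values()) + 1
    for key, value in write_set.items():
        store[key] = (new_ts, value)
    return True

# T1 and T2 both take their snapshot at ts 0 and read x == 100.
t1_ok = try_commit(0, {"x": 120})
t2_ok = try_commit(0, {"x": 130})  # rejected: x changed since the snapshot
```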
----
### Unachievable properties
- preventing lost updates
```
     lost update (a==1) -
     T1: Rx(100), Wx(100+20=120)
     T2: Rx(100), Wx(100+30=130)
```
  Final value should be 150. Lost update would be 120 or 130.
  W/ partition, T1 and T2 might not see each other, hence lost update.

  *Cannot be prevented while staying available under partitions.*
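
A minimal sketch of the partitioned case, assuming two hypothetical replicas that each serve one of the transactions locally and later reconcile by keeping a single value.

```python
# Hypothetical setup: two replicas of x that cannot communicate.
replica_a = {"x": 100}
replica_b = {"x": 100}

def deposit(replica, amount):
    """Read-modify-write against the local replica only."""
    replica["x"] = replica["x"] + amount

deposit(replica_a, 20)   # T1 on one side of the partition
deposit(replica_b, 30)   # T2 on the other side

# When the partition heals, a last-writer-wins merge keeps one of these values.
merged_candidates = {replica_a["x"], replica_b["x"]}
intended = 100 + 20 + 30
```

Whichever value the merge keeps, one of the two increments is lost.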
  
- preventing write skew. Write skew generalizes lost update to multiple keys. A possible problem is
violation of a consistency constraint, such as "x == y"
```
     T1: t = x;  y = t
     T2: t = y;  x = t
```
Can happen w/ snapshot isolation even w/o partitions.
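
A minimal sketch of the schedule above under snapshot isolation, assuming both transactions read from the same snapshot and that commit validation only looks for write-write conflicts. The initial values are made up for illustration.

```python
db = {"x": 1, "y": 2}

snapshot_t1 = dict(db)   # both transactions read from the same snapshot
snapshot_t2 = dict(db)

writes_t1 = {"y": snapshot_t1["x"]}   # T1: t = x; y = t
writes_t2 = {"x": snapshot_t2["y"]}   # T2: t = y; x = t

# The write sets are disjoint, so a write-write conflict check passes for both.
disjoint = not (writes_t1.keys() & writes_t2.keys())
db.update(writes_t1)
db.update(writes_t2)
print(db)
```

Either serial order would leave x equal to y; here the values are swapped instead.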

- Serializability:
  - optimistic requires global validation
  - pessimistic requires global coord/locking



- Katura not buying sticky avail (definitional)
- Nao -  "really confusing"  (yes)
- Patrick/Andrew - causal only w/ sticky (client caching breaks lots of guarantees)



----

### HAT-compliant:

![pic](hat1.png)

### Partial ordering
![pic](hat2.png)

- Blue is sticky available
- Red is unavailable
----
