# Existential consistency
![](exisArch.png)
Interesting:
- "local" consistency means composable.
- per-object SC
- **single-master** per shard
- TAO is data-store for social graphs
- conservative anomaly detection (because each operation's invocation/response window is expanded by 35 ms)
- good results due to:
  - low write frequency
  - per-object read locality
- often *undercounts* anomalies because clock skew is ignored
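
A minimal sketch of why widening the windows makes detection conservative: two operations count as ordered in real time only if they still do not overlap after each window is widened. The `Op` type, field names, and example timestamps are hypothetical; only the 35 ms figure comes from the notes.

```python
from dataclasses import dataclass

SKEW_MS = 35  # widening applied to each side of an operation (figure from the notes)


@dataclass(frozen=True)
class Op:
    name: str
    invoked_ms: float    # observed invocation time
    responded_ms: float  # observed response time


def widen(op: Op, skew_ms: float = SKEW_MS) -> Op:
    """Move the invocation earlier and the response later by skew_ms."""
    return Op(op.name, op.invoked_ms - skew_ms, op.responded_ms + skew_ms)


def definitely_before(a: Op, b: Op, skew_ms: float = SKEW_MS) -> bool:
    """True only if a ended before b began even after widening both."""
    return widen(a, skew_ms).responded_ms < widen(b, skew_ms).invoked_ms


# Hypothetical trace: one write followed by two reads of the same object.
write = Op("write x=1", invoked_ms=0, responded_ms=10)
near_read = Op("read x", invoked_ms=50, responded_ms=55)
far_read = Op("read x", invoked_ms=100, responded_ms=105)
```

With these numbers, `near_read` overlaps the widened write, so a stale value returned by it would not be flagged, while `far_read` is still ordered after the write and would be. Fewer ordering constraints means fewer reported anomalies.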

![](exisCons.png)
![](exisAnom.png)

Interesting:
- How much is due to session consistency?  (answered by phi)
- 99.8% reads, so consistency is not as important...
  
# Warm Blobs

**Motivation:** decrease effective replication factor.
- Approach is basically to offload older ("cooler") objects to erasure-coded storage

## Correlations:
![](blobAge.png)
- blob age vs temperature (older is cooler)
- age and delete rate (older are deleted less frequently)


![](blobSystem.png)

## Structure
Blobs are aggregated into append-only *volumes*, with a max size of 100 GB.
- data   (data, metadata)
- index  (snapshot of internal lookup structure)
- **journal** (tracks deletes)  (haystack only)
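
A rough in-memory sketch of how the three parts of a volume fit together, assuming the index maps a blob id to an (offset, length) pair and the journal is simply a set of deleted ids. The class and method names are hypothetical, and the real index is a snapshot of the lookup structure rather than a live dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    data: bytearray = field(default_factory=bytearray)  # stand-in for the data file
    index: dict = field(default_factory=dict)           # blob id -> (offset, length)
    journal: set = field(default_factory=set)           # ids of deleted blobs

    def append(self, blob_id: str, payload: bytes) -> None:
        # Append-only: new blobs always land at the end of the data file.
        self.index[blob_id] = (len(self.data), len(payload))
        self.data += payload

    def delete(self, blob_id: str) -> None:
        # Nothing is overwritten; the delete is only recorded in the journal.
        self.journal.add(blob_id)

    def read(self, blob_id: str):
        if blob_id in self.journal or blob_id not in self.index:
            return None
        offset, length = self.index[blob_id]
        return bytes(self.data[offset:offset + length])
```

Because a delete only touches the journal, the data file never changes in place, which is what keeps the volume append-only.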

![](blobTiers.png)

![](blobCell.png)
Details:
- A cell is 14 racks of 15 hosts, each host holding 120 TB of data.
  - *Reed-Solomon coding* (polynomials) across a **cell**
    - An *(n, k)* code encodes *n* bits of data with *k* extra bits of parity, and can tolerate
      *k* failures, at an overall storage size of *n + k*. (Here *n* = 10, *k* = 4.)
- region fault tolerance by doing geo-replicated XOR
- *backoff nodes* used for online reconstruction of blobs after failure
- *rebuilder nodes* reconstruct entire volumes
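
The storage arithmetic and the XOR recovery can be sketched as below. The helper names and sample blocks are hypothetical. The 1.5x factor assumes the XOR pairs two blocks held in two different regions and stores their XOR in a third, which these notes do not spell out.

```python
def rs_overhead(n: int, k: int) -> float:
    """Storage used per unit of data for an (n, k) code: (n + k) / n."""
    return (n + k) / n


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Within a cell: (10, 4) stores 14 units for every 10 units of data.
cell_factor = rs_overhead(10, 4)

# Across regions (assumption): 2 blocks plus 1 XOR block, so 3 stored per 2 of data.
geo_factor = 3 / 2
effective_factor = cell_factor * geo_factor  # roughly 2.1

# Losing one region: rebuild its block from the buddy block and the XOR block.
block_a = b"region-A block"
block_b = b"region-B block"
xor_c = xor_blocks(block_a, block_b)
recovered_a = xor_blocks(xor_c, block_b)
```

XOR is its own inverse, so any one of the three blocks can be rebuilt from the other two.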

### Encryption:
- per-blob key
- deletes handled by deleting key (in database)
  - storage space never reclaimed
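
A toy sketch of delete-by-key-deletion, assuming one random key per blob held in a separate key table. The keystream cipher built from SHA-256 is only a stand-in so the example runs on the standard library; it is not the cipher the system uses and is not suitable for real encryption. All names are hypothetical.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream (illustration only)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out += bytes(c ^ p for c, p in zip(chunk, pad))
    return bytes(out)


class BlobStore:
    def __init__(self):
        self.keys = {}        # stand-in for the key database
        self.ciphertext = {}  # stand-in for the append-only storage

    def put(self, blob_id: str, payload: bytes) -> None:
        key = secrets.token_bytes(32)  # fresh key for every blob
        self.keys[blob_id] = key
        self.ciphertext[blob_id] = keystream_xor(key, payload)

    def get(self, blob_id: str):
        key = self.keys.get(blob_id)
        if key is None:
            return None  # key is gone, so the blob is unreadable
        return keystream_xor(key, self.ciphertext[blob_id])

    def delete(self, blob_id: str) -> None:
        # Only the key is removed; the ciphertext stays where it is.
        self.keys.pop(blob_id, None)
```

After `delete`, the encrypted bytes still occupy storage, matching the note that space is never reclaimed.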

Questions:

- Do they have consistency concerns? (greg)
- When data is reconstructed after a quadruple failure, does the system just reconstruct
  the buddy block from the companion blocks and parity blocks and return it to the
  consumer, or does it store it back in the system for future use? (sankha)

