# Finding global properties

*Consistent Snapshots*

Assume:
- piecewise deterministic
- reliable, ordered, uni-directional communication channels
- no other comm

To get snapshot (atomically do):
1. take ckpt at one process
2. send token on all outgoing edges (as the next msg on each edge)


At receipt of token, if haven't already seen it (atomically do):
1. take ckpt
1. send token on all outgoing edges 

Want:
- consistent checkpoints
- channel contents.
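
A minimal sketch of the token protocol above, as a two-process simulation. The notes don't spell out how channel contents get recorded; the rule used here (after its own ckpt, a process records each incoming channel until the token arrives on it) is the usual Chandy-Lamport rule and is an assumption of this sketch. `Process`, `connect`, and the integer "balance" state are hypothetical names for illustration.

```python
from collections import deque

TOKEN = "TOKEN"  # the snapshot token from the notes


class Process:
    def __init__(self, name, state):
        self.name = name
        self.state = state    # application state: here just an int balance
        self.out = {}         # dest name -> outgoing FIFO channel
        self.inc = {}         # source name -> incoming FIFO channel
        self.ckpt = None      # checkpointed state, once taken
        self.recorded = {}    # source name -> msgs caught in flight
        self.closed = set()   # incoming channels whose token has arrived

    def send(self, dst, amount):
        self.state -= amount
        self.out[dst].append(amount)

    def take_ckpt(self):
        # atomically: checkpoint, then token on every outgoing edge
        self.ckpt = self.state
        self.recorded = {src: [] for src in self.inc}
        for ch in self.out.values():
            ch.append(TOKEN)

    def receive(self, src):
        msg = self.inc[src].popleft()
        if msg == TOKEN:
            if self.ckpt is None:       # first time we see the token
                self.take_ckpt()
            self.closed.add(src)        # nothing more to record on src
        else:
            self.state += msg
            if self.ckpt is not None and src not in self.closed:
                # sent before sender's ckpt, received after ours
                self.recorded[src].append(msg)


def connect(a, b):
    ch = deque()            # one uni-directional channel a -> b
    a.out[b.name] = ch
    b.inc[a.name] = ch


A, B = Process("A", 10), Process("B", 20)
connect(A, B)
connect(B, A)

B.send("A", 3)      # in flight when the snapshot starts
A.take_ckpt()       # A initiates
A.receive("B")      # the 3 arrives after A's ckpt: channel content
B.receive("A")      # B sees the token: ckpt, forward token
A.receive("B")      # token closes channel B -> A

in_flight = sum(A.recorded["B"]) + sum(B.recorded["A"])
print(A.ckpt, B.ckpt, A.recorded["B"], A.ckpt + B.ckpt + in_flight)
```

The checkpoints (10 and 17) plus the recorded channel content (3) add up to the system total of 30, even though A and B checkpointed at different moments.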

![snapshots](snapshots.png)

Consistent state is:
- A, B, C when they received the snapshot command
- m_2

Reconstructing this state:
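- the recorded global state might never have existed at any one instant
- but it **is** equivalent to a state that could have happened (reordering concurrent events reaches it)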

# Another way: Logical Time for distributed systems

notions:
- wall clock (lots of problems)
- logical time

## How should causality work?

(motivation)
- data init 0
- all values unique


```
      P1     P2
     w(x)1
     w(y)1
            r(y)1
            r(x)?
```

```
    w(x)1 -> r(x)1
```
So `r(x)` must return 1, since `w(x)1 -> w(y)1 -> r(y)1 -> r(x)` ("session semantics").

Also, what about:
```
      P1     P2
     w(x)1   w(y)1
     r(y)0   r(x)0
```

`w(x)1 -> r(y)0 -> w(y)1 -> r(x)0 -> w(x)1`

oops: a cycle, so no ordering of the events explains both reads returning 0.

## Happens-Before

1. program order ('->')
1. **send** msg before **receipt**
1. transitive closure

If time only moves forward:
- '->' has no cycles
- *partial* order

If `!(e -> e')` and `!(e' -> e)`:
- `e`, `e'` are **concurrent**
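
A small sketch of the three rules above: build `->` from program order and msg edges, take the transitive closure, then test for concurrency and for cycles. The function names and the event labels are hypothetical, chosen for illustration.

```python
from itertools import product


def happens_before(procs, msgs):
    """procs: one event list per process, in program order.
    msgs: (send_event, receive_event) pairs.
    Returns the set of pairs (e, e2) with e -> e2."""
    events = [e for p in procs for e in p]
    hb = set()
    for p in procs:                 # rule 1: program order
        hb.update(zip(p, p[1:]))
    hb.update(msgs)                 # rule 2: send before receipt
    # rule 3: transitive closure (Warshall: intermediate event k outermost)
    for k, i, j in product(events, repeat=3):
        if (i, k) in hb and (k, j) in hb:
            hb.add((i, j))
    return hb


def concurrent(hb, e1, e2):
    return (e1, e2) not in hb and (e2, e1) not in hb


def has_cycle(hb):
    return any(a == b for a, b in hb)


# P1 runs a, b; P2 runs c, d; b sends a msg that d receives.
hb = happens_before([["a", "b"], ["c", "d"]], [("b", "d")])
print(("a", "d") in hb)             # True, via a -> b -> d
print(concurrent(hb, "a", "c"))     # True
print(has_cycle(hb))                # False

# The earlier two-process example where both reads return 0:
bad = happens_before(
    [["w(x)1", "r(y)0"], ["w(y)1", "r(x)0"]],
    [("r(y)0", "w(y)1"), ("r(x)0", "w(x)1")],
)
print(has_cycle(bad))               # True
```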

## Example for Logical Time

### Assume:
- **fully-replicated** key-value (KV) store
- unicast messages (assume they are *writes* to the kv)
  - reliable, ordered
- comm *only* through msg-passing.

### How to achieve causality?
- events of one proc linearly ordered
- send (write) *happens-before* receive (read)

### **Lamport's scalar clocks**:
1. internal event or send event at Pi, `Ci := Ci + d`
2. each msg carries send timestamp
3. at Pj, on rcv of msg w/ timestamp t: `Cj := max(Cj, t) + d` (d usually 1)

(many variants: initial value (e.g. 0), whether to increment at send, how the clock is set at receive)
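
The three rules can be sketched as a small class. `LamportClock` and its method names are hypothetical; `d` defaults to 1 and clocks start at 0, which is one of the possible choices.

```python
class LamportClock:
    def __init__(self, d=1):
        self.time = 0     # initial value 0 (an assumption)
        self.d = d

    def tick(self):
        """Rule 1: internal event or send event, Ci := Ci + d."""
        self.time += self.d
        return self.time

    def send(self):
        """Rule 2: the msg carries the send timestamp."""
        return self.tick()

    def receive(self, t):
        """Rule 3: Cj := max(Cj, t) + d."""
        self.time = max(self.time, t) + self.d
        return self.time


p1, p2 = LamportClock(), LamportClock()
p1.tick()               # internal event at P1, clock 1
t = p1.send()           # send at P1, msg stamped 2
p2.tick()               # internal event at P2, clock 1
r = p2.receive(t)       # max(1, 2) + 1
print(t, r)             # 2 3
```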

```
    P1 e1       />  e5    \       e6
    P2   e2    /     e3    \
    P3           e4         \>  e7
```

Implications:
- events at diff procs can have same timestamp

What if we need uniqueness?
(tie-break of proc #)
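
A sketch of the tie-break: order events by the pair (timestamp, proc #), so equal timestamps are separated by process id. The event tuples below are hypothetical.

```python
# (timestamp, proc id, label): two events share timestamp 2
stamped = [(2, 3, "c"), (1, 2, "b"), (2, 1, "a")]

# compare timestamps first, break ties on the process id
ordered = sorted(stamped, key=lambda e: (e[0], e[1]))
labels = [label for _, _, label in ordered]
print(labels)  # ['b', 'a', 'c']
```

The result is a total order that is still consistent with happens-before, since it only refines the scalar timestamps.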

What do we lose w/ scalar clock?
- can't conclude `e -> e'` from `C(e) < C(e')`: the two events may be concurrent

---
How to fix?

## Vector time (vector clocks) 
- each proc knows something about what other procs have seen
- Each proc Pi has clock Ci that is vector of len n (# procs).

**Vector clock:**
- initialized as zero vector
- increment local at event
- at send:
  - increment local
  - send w/ vector
- at receipt
  - increment local
  - `Ci = componentwise-max(Ci, t)` (t = vector carried by the msg)
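
The rules above can be sketched as follows. `VectorClock` and its method names are hypothetical; processes are numbered from 0 here.

```python
class VectorClock:
    def __init__(self, i, n):
        self.i = i            # index of the owning process
        self.v = [0] * n      # initialized as zero vector

    def event(self):
        """Local event: increment own entry."""
        self.v[self.i] += 1
        return list(self.v)

    def send(self):
        """Send: increment own entry, ship a copy of the vector."""
        return self.event()

    def receive(self, t):
        """Receipt: increment own entry, then component-wise max with t."""
        self.v[self.i] += 1
        self.v = [max(a, b) for a, b in zip(self.v, t)]
        return list(self.v)


p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
p0.event()              # [1, 0, 0]
t = p0.send()           # [2, 0, 0]
p1.event()              # [0, 1, 0]
r = p1.receive(t)       # [2, 2, 0]
print(t, r)
```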

![vector clocks](vectorclock.png)

So `Ci[i]` counts the events that have occurred at Pi, and `Ci[j]` counts the events at Pj that Pi has (transitively) heard about.

What do these relations mean w/ vectors (**u**, **v**):
- `u <= v` iff forall i: `u[i] <= v[i]`
- `u < v` iff `u <= v` && exists i s.t. `u[i] < v[i]`
- `u || v` iff `!(u < v) && !(v < u)`
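
These relations translate directly into code. A sketch with hypothetical function names, on plain Python lists of equal length:

```python
def leq(u, v):
    """u <= v: every component of u is <= the matching one in v."""
    return all(a <= b for a, b in zip(u, v))


def lt(u, v):
    """u < v: u <= v and at least one component is strictly smaller."""
    return leq(u, v) and any(a < b for a, b in zip(u, v))


def concurrent(u, v):
    """u || v: neither vector is less than the other."""
    return not lt(u, v) and not lt(v, u)


print(lt([2, 0, 0], [2, 2, 0]))          # True
print(lt([2, 2, 0], [2, 0, 0]))          # False
print(concurrent([1, 0, 0], [0, 1, 0]))  # True
```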

What do we get?
- can now tell if two points are causally related.

Properties:
- if `u(a) < v(b)`, then `a -> b` (and conversely), where `u(a)`, `v(b)` are the vector timestamps of events a, b
- antisymmetry: if `u < v`, then `!(v < u)`
- transitivity
- if `u(a) < v(b)`, then `wall(a) < wall(b)`

---------

### What do we do w/ vector time?

- finding "earliest" event
- session semantics
- resolve deadlocks
- consistent checkpoints
  - ensure no msgs received but not sent, no cycles
- observer getting notifications from different processes might want order
- debugging
  - can show that some event cannot have caused another
  - reduce information necessary to replay
- detect race conditions
  - if two procs interact outside msgs, and interaction concurrent, it's
    a race
- measuring "degree of *possible* parallelism"
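
As one example from this list, a sketch of the consistent-checkpoint test ("no msgs received but not sent"). It assumes each process Pi saves its vector clock with its checkpoint, and uses the standard condition that no process has seen more events of Pi than Pi's own checkpoint contains. `is_consistent_cut` is a hypothetical name.

```python
def is_consistent_cut(cut):
    """cut[i] is the vector clock saved with Pi's checkpoint.
    Consistent iff, for all i and j, cut[j][i] <= cut[i][i]."""
    n = len(cut)
    return all(cut[j][i] <= cut[i][i] for i in range(n) for j in range(n))


# P0 has done 2 events; P1 has done 1 and heard nothing from P0.
print(is_consistent_cut([[2, 0], [0, 1]]))   # True

# P1 has already seen 2 events of P0, but P0's checkpoint holds only 1:
# a msg was received that (per the checkpoints) was never sent.
print(is_consistent_cut([[1, 0], [2, 2]]))   # False
```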

---

## Matrix clocks

Know what others know of others

(pic)

    P1                1         2>  3\
    P2       1   2        3>    4/     \
    P3  1:"sky is blue"  2/            3>
    
                                            3 4 3
                                            2 4 n
                                            2 4 3

Updates to local Mi:
- Pi's local event, including a send: `Mi[i][i]++`
- On receive of matrix Mj from Pj:
  - increment `Mi[i][i]++`
  - forall k, `Mi[i][k] = max(Mi[i][k], Mj[j][k])`
  - forall k, l, `Mi[k][l] = max(Mi[k][l], Mj[k][l])`

Gives a lower bound on what other hosts know; useful for:
- checkpointing
- garbage collection
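
A sketch of a matrix clock and of the lower bound it provides. It assumes the standard formulation, in which the receiver merges the sender's own row `Mj[j]` into its own row and also takes the element-wise max over every row. `MatrixClock` and `known_by_all` are hypothetical names; processes are numbered from 0.

```python
class MatrixClock:
    def __init__(self, i, n):
        self.i = i
        self.m = [[0] * n for _ in range(n)]   # row k: what Pi thinks Pk knows

    def event(self):
        self.m[self.i][self.i] += 1

    def send(self):
        self.event()                           # a send is a local event
        return [row[:] for row in self.m]      # ship a copy of the matrix

    def receive(self, j, mj):
        i, n = self.i, len(self.m)
        self.m[i][i] += 1
        for k in range(n):                     # own row: merge sender's row
            self.m[i][k] = max(self.m[i][k], mj[j][k])
        for k in range(n):                     # every row: element-wise max
            for l in range(n):
                self.m[k][l] = max(self.m[k][l], mj[k][l])

    def known_by_all(self, k):
        """Lower bound: every process has seen at least this many events of Pk."""
        return min(row[k] for row in self.m)


p0, p1, p2 = (MatrixClock(i, 3) for i in range(3))
p0.event()
p1.receive(0, p0.send())     # P1 learns of P0's 2 events
before = p1.known_by_all(0)  # P2 has not heard yet, as far as P1 knows
p2.receive(1, p1.send())     # P2 learns what P0 and P1 know
after = p2.known_by_all(0)
print(before, after)         # 0 2
```

Once `known_by_all(k)` reaches some value x, state tied to the first x events of Pk (e.g. a log entry) is known everywhere and can be garbage collected.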


Performance
- scalar cheap
- vector, matrix mostly impractical at scale (n and n^2 entries per msg), but:
  - can send incrementally (only the changed entries, as (index, value) tuples)
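
One way to read the incremental idea, as a sketch: keep a copy of the vector last sent to each destination and ship only the entries that changed since then, as (index, value) tuples. This relies on reliable, ordered channels, and it is a simplification for illustration, not necessarily the scheme the notes have in mind. `DiffVectorClock` is a hypothetical name.

```python
class DiffVectorClock:
    def __init__(self, i, n):
        self.i = i
        self.v = [0] * n
        self.last_sent = {}   # dest -> copy of the vector as last sent there

    def send(self, dest):
        self.v[self.i] += 1
        prev = self.last_sent.get(dest, [0] * len(self.v))
        # only the entries that differ from what dest was last told
        diff = [(k, x) for k, (x, old) in enumerate(zip(self.v, prev)) if x != old]
        self.last_sent[dest] = list(self.v)
        return diff

    def receive(self, diff):
        self.v[self.i] += 1
        for k, x in diff:     # max over the shipped entries only
            self.v[k] = max(self.v[k], x)


p0, p1, p2 = (DiffVectorClock(i, 4) for i in range(3))
p0.receive(p2.send(0))
first = p0.send(1)            # P1 has heard nothing yet: two entries
p1.receive(first)
second = p0.send(1)           # only P0's own entry changed
p1.receive(second)
print(first, second, p1.v)    # [(0, 2), (2, 1)] [(0, 3)] [3, 2, 1, 0]
```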


---

