# Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
## Contributions:
- causal+
  - linearizable within cluster
  - replicated keys using `chain replication`
  - causal across clusters
- bolt-on-ish
  - use of explicit dependencies (before bolt-on)
- get (read) transactions

## COPS
- individual accesses
- explicit tracking of dependences
  - *nearest*
- contexts (threads, or explicit)
- Lamport Clocks
  - version is CLOCK.nodeID
  - convergent, not necessarily eventual

- "context" implemented incrementally as set of nearest neighbors
- read:
  - return value added to context
- write:
  - context added to write
  - context zero'd out
  - context set to new write
- propagated with writes to remote clusters
- write installed at new cluster (sketched below) by:
  - `depCheck()`s to owner of each write in "nearest"
  - `depCheck()` blocks until true
  - `depCheck()` re-issued on time-out
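A minimal sketch (Python) of this read/write path, assuming a single node stands in for a whole cluster and that `depCheck` is answered against the local store rather than by the owning node; the class and method names are illustrative, not COPS's actual API:

```
import time

class Context:
    """Per-client context: the 'nearest' dependencies accumulated so far."""
    def __init__(self):
        self.nearest = set()                 # set of (key, version) pairs

class ClusterNode:
    """One node standing in for a whole COPS cluster in this sketch."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = 0                       # Lamport clock
        self.store = {}                      # key -> (version, value, deps)

    def get(self, ctx, key):
        version, value, _ = self.store[key]  # assume the key exists in this sketch
        ctx.nearest.add((key, version))      # return value added to context
        return value

    def put(self, ctx, key, value):
        self.clock += 1
        version = (self.clock, self.node_id)       # version is CLOCK.nodeID
        deps = set(ctx.nearest)                     # context added to write
        self.store[key] = (version, value, deps)
        ctx.nearest = {(key, version)}              # context zero'd out, set to new write
        return key, value, version, deps            # propagated to remote clusters

    def dep_check(self, key, version):
        # In COPS this query goes to the owner of `key`; here we just check locally.
        entry = self.store.get(key)
        return entry is not None and entry[0] >= version

    def apply_remote(self, key, value, version, deps):
        # Install the replicated write only after every "nearest" dependency passes
        # dep_check; a real implementation blocks and re-issues on time-out.
        while not all(self.dep_check(k, v) for k, v in deps):
            time.sleep(0.01)                        # crude stand-in for block/retry
        self.store[key] = (version, value, deps)
        self.clock = max(self.clock, version[0])    # Lamport clock catches up
```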
## COPS-GT
*causal not always enough*
### Straightforward problems
Alice:
```
s0: ACL == "world"
p0:"all is good"
s1: change ACL to exclude Boss
p2: "hate my job"
```
Assume reads are implemented by an ACL check followed by the read.

Boss:
- checks security: sees s0, whose ACL includes Boss
- read returns p2
- Alice is fired

A *TOCTOU* (time-of-check-to-time-of-use) vulnerability.
### Another implementation
Might consider re-implementing the library to issue GETs in reverse order, checking "nearest" as you go:
- works for the above example (p2 depends on s1)
But if Alice adds a few more actions:
```
p3: delete last post
s2: add Boss back to ACL
```
...the reverse order doesn't help:
- could return p2, s2 (both possibly the latest version, each causally consistent individually when issued)
### A causally-consistent set of reads
- s0, p0: "world" / "all is good"
- s1, p0: "no-boss" / "all is good"
- s1, p2: "no-boss" / "hate job"
- s1, p3: "no-boss" / "all is good"
- s2, p3: "world" / "all is good"
## GT implementation
- "all dependences"
  - really means one per key, for keys that have been locally modified
  - "nearest" is still a big win over this, because there might be *many* keys
- individual writes not coordinated
  - by keeping **all** dependences, not just nearest
- clearly need GC

A sketch of the get-transaction these full dependency lists enable follows.
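A hedged sketch of the two-round get-transaction, assuming each stored entry carries (version, value, deps), where deps holds the one-per-key dependency versions; `store.get` and `store.get_by_version` are hypothetical calls, not COPS-GT's actual interface:

```
def get_trans(store, keys):
    # Round 1: optimistic reads; each entry is (version, value, deps).
    results = {k: store.get(k) for k in keys}

    # For every requested key, find the newest version any returned value depends on.
    ccv = {k: results[k][0] for k in keys}                # causally-correct versions
    for version, value, deps in results.values():
        for dkey, dver in deps:
            if dkey in ccv and dver > ccv[dkey]:
                ccv[dkey] = dver

    # Round 2: re-fetch only the keys whose first-round version is too old.
    for k in keys:
        if ccv[k] > results[k][0]:
            results[k] = store.get_by_version(k, ccv[k])  # hypothetical call

    return {k: results[k][1] for k in keys}               # key -> value
```

The second round only re-fetches keys whose first-round value is older than some other returned value's dependency, so the result set is causally consistent without coordinating the writes themselves.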

## Notes
- Kaitlyn - just measured against themselves
- Elliot - ALPS
- Huayang - CAP, what does paxos give up?
- Nikhil - when not convergent?
# Bolt-On Consistency
### Fun scenario
- Sally posts "I think Billy is missing!"
- Seconds later, Billy calls mom, says "Safe!", and Sally posts "False alarm!"
- James posts "What a relief!"

Henry sees JUST the original post and James's, and is confused.

**Shim layer to support causal consistency over weaker (eventually-consistent) systems.**

### Design and Implementation
Write path (sketched below):
- client write performed in the **local store**, then sent to the ECDS with metadata

Read path (causal):
- client read satisfied by the local store

Read path (pessimistic):
- synchronous checking of data, and of dependencies, in the ECDS, rather than relying on the async *resolve* process to do this for us
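A minimal sketch of the write path and causal read path, assuming a dict-backed local store and an ECDS client exposing put/get; `Shim` and its methods are illustrative names, not the paper's API:

```
import itertools

class Shim:
    """Causal shim over an eventually-consistent store; the local store stays causal."""
    _versions = itertools.count(1)

    def __init__(self, ecds):
        self.local = {}      # key -> (version, value, deps)
        self.ecds = ecds     # assumed client exposing put(key, record) / get(key)

    def put(self, key, value, after):
        # Write path: perform the write in the local store, then send it to the
        # ECDS together with its dependency metadata.
        version = next(Shim._versions)
        deps = set(after)                          # 'after' = previously returned write ids
        self.local[key] = (version, value, deps)
        self.ecds.put(key, (version, value, deps))
        return (key, version)                      # id the app may pass in a later 'after'

    def get(self, key):
        # Causal read path: satisfied purely from the local store.
        entry = self.local.get(key)
        return entry[1] if entry else None
```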
### Motivation
- Several NoSQL stores have added stronger consistency, implying need: [Vogels: Choosing consistency.](http://www.allthingsdistributed.com/2010/02/)
- HAT work shows causal consistency is achievable even in the face of partitions
- allows purely local reads (so does eventual!)
- no production-ready store provides it
### Challenges
- handling overwritten histories:
  - eventually-consistent stores often choose a single winning write; existing cc algorithms fail because the cc layer might not see all writes
- maintaining availability
### Innovations
- algorithm that buffers writes and ensures visible writes satisfy a *causal cut*
### Causality
Differentiate between:
- **potential causality**: what we usually think of, and *which they do not consider further*, and
- **explicit causality**, i.e. tags provided by the app
### Bolt-on *separation of concerns*
- safety properties (the causal *shim*), from
- liveness (the eventually-consistent data store (ECDS))
### Bolt-on
Operations:
- **get(key)**
- **put(key, value, after)**, where *after* is any set of previously returned writes (identifiers)
### Consistency cases
Let's assume x1 -> y1 (x1 existed in the store when/where y1 was created). So any replica reading y1 should have (or have had) x1 as well. Assume x1 and y1 were created by different processes, and discuss what happens at P when it wants to read *y*:
1. P has not received any writes
2. P has received x1, but not y1
3. P has received y1, but not x1
4. P has received both
Answers:
1. P has not received any writes; returns y=null.
2. P has received x1, but not y1. x1 put in local store; returns y=null.
3. P has received y1, but not x1. P buffers y1; returns y=null.
4. P has received both; both put in local store; returns y=y1.
The fun case is 3. We could just go ahead and let P read *y1*, but we'd have to remember that, and stall the process if it subsequently reads *x*.
- this means we give up availability (consider a network partition)

A sketch of the case-3 buffering appears below.
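A hedged sketch of the buffering in case 3, under the same assumptions as the shim sketch above; `covered`, `resolve`, and `receive_from_ecds` are illustrative names:

```
def covered(local, deps):
    """True if every dependency (key, version) is already in the local store."""
    return all(key in local and local[key][0] >= ver for key, ver in deps)

def resolve(local, buffered):
    """Flush any buffered write whose dependencies are now covered; repeat to a fixpoint."""
    progress = True
    while progress:
        progress = False
        for key, (ver, value, deps) in list(buffered.items()):
            if covered(local, deps):
                local[key] = (ver, value, deps)    # now safe to make visible (case 4)
                del buffered[key]
                progress = True

def receive_from_ecds(local, buffered, key, ver, value, deps):
    if covered(local, deps):
        local[key] = (ver, value, deps)            # dependencies present: apply directly
    else:
        buffered[key] = (ver, value, deps)         # case 3: buffer; reads still return null
    resolve(local, buffered)
```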
### Overwritten Histories
The following is the motivation to NOT just use the underlying store to hold incoming writes directly.

Example (concurrent):
- P1 writes x1 -> y1 -> z1
- P2 writes y2

Assume the shim stores immediate dependencies, and assume *P* has received `z1`, and then `y2`. *P* did know that it needed `y1`, but doesn't know about `x1`. The overwrite by `y2` loses the dependency on `x1`, and the transitive dependency `x1 -> z1` is also lost.
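A small sketch of why this happens if the shim wrote metadata straight into a last-writer-wins store; the dict-based store and integer versions are assumptions for illustration:

```
ecds = {}                                            # key -> (version, value, deps)

def lww_put(key, version, value, deps):
    """Last-writer-wins: keep only the highest-versioned write (and its metadata)."""
    cur = ecds.get(key)
    if cur is None or version > cur[0]:
        ecds[key] = (version, value, deps)

# P1's chain x1 -> y1 -> z1, shipped with immediate dependencies only:
lww_put("x", 1, "x1", set())
lww_put("y", 1, "y1", {("x", 1)})
lww_put("z", 1, "z1", {("y", 1)})

# Concurrent y2 from P2 overwrites y1; y1's pointer to x1 vanishes with it:
lww_put("y", 2, "y2", set())

print(ecds["z"][2])   # {('y', 1)} -- but version 1 of y, and its deps, are gone
print(ecds["y"][2])   # set()      -- the transitive dependency x1 -> z1 is unrecoverable
```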
### Causal Cuts
Causal cut (paraphrase): a set of key-versions, at most one per key, such that all dependencies of any key are either part of the cut, or concurrent with the cut's version of that key. (A small check for this condition is sketched in code after the examples below.)
Example:

Some causal cuts:
- w1, x1, y1
  - yes
- x1, w2, y1
  - yes
- w1, x1, y1, z1
  - yes
- x1, z1
  - NO: need y1, w1 or w2
- w2, x1, y1, z1
  - yes
- w1
  - yes
- w1, z1
  - NO: need x1, y1
- w2
  - yes
- w1, x1
  - yes
- y1, z1
  - NO: need x1, w1 or w2
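A hedged sketch of checking this condition over a candidate cut; how versions, dependencies, and concurrency are represented here is an assumption, not the paper's formalism:

```
def is_causal_cut(cut, deps, concurrent):
    """
    cut:        dict key -> version (at most one version per key)
    deps:       dict (key, version) -> set of (key, version) dependencies
    concurrent: predicate telling whether two versions of the same key are concurrent
    """
    for key, version in cut.items():
        for dkey, dver in deps.get((key, version), set()):
            if dkey not in cut:
                return False                 # dependency's key absent from the cut
            cut_ver = cut[dkey]
            if cut_ver != dver and not concurrent(cut_ver, dver):
                return False                 # cut's version is neither the dependency nor concurrent with it
    return True
```

Under this check, a cut may contain w2 even though some member depends on w1, as long as w1 and w2 are concurrent versions of the same key.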
### Metadata
Expected size of the dependency set is the number of unique keys in the causal history.
### Errata
- *Callback on convergence* allows key-versions to be dropped from histories once seen everywhere (GC).
- store pruning: drop keys no longer needed from the local store; might require dropping other writes that depend on them
## Notes/comments
- Nikhil: first read is null; lots of action, no reads.
- Anubhav - assumes stickiness, which can slow things down under load
- Anubhav - long-range dependencies can add up
- does the shim give eventual consistency?
- is it possible to make a system that will never converge, but still lazily claim eventual consistency, since it's impossible to prove otherwise?
- has it caught on?
- surprised that it's faster than eventual
- fate-sharing: how does a rebooting client know what happened?
- scale-out: assumes clients scale w/ replicas (especially since co-located)
# Coordination Avoidance in Database Systems
When is coordination strictly necessary to maintain application-level consistency? Ask programmers.

"Coordination can only be avoided if all local commit decisions are globally valid"

Factoids:
- invariants can be expressed as part of schema DDL
- replicas eventually consistent
Issues:
- sometimes programmers are bad
- black-and-white model: some communication might be cheap, some expensive
### Invariant confluence
A replica is *I-valid* iff invariant I is true there.

A *globally I-valid system* can execute a set of transactions T with coordination freedom, transactional availability, and convergence iff T is I-confluent with respect to I.

A system provides coordination-free execution for a set of transactions T iff the progress of executing each *t in T* depends only on the versions of the items t reads (i.e., t's replica state).
## Example
**PRIMARY KEY/UNIQUE constraints**. Assume userids should be unique. {Stan:5} and {Mary:5} are both I-T-reachable, but {Stan:5, Mary:5} is not I-valid. Therefore, uniqueness is not I-confluent for inserts of unique values.

**HOWEVER:** *reads* and *deletes* are both I-confluent under uniqueness.

**AND FURTHER:** one could add a function to generate unique IDs such that IDs from distinct replicas don't overlap, and then uniqueness would be I-confluent.
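A small sketch of that last point, assuming replicas interleave IDs by index; the scheme is an illustrative choice, not the paper's prescription:

```
class ReplicaIdGen:
    """Each replica draws from a disjoint ID space: counter * num_replicas + replica_index."""
    def __init__(self, replica_index, num_replicas):
        self.replica_index = replica_index
        self.num_replicas = num_replicas
        self.counter = 0

    def next_id(self):
        uid = self.counter * self.num_replicas + self.replica_index
        self.counter += 1
        return uid

# Two replicas generating IDs independently can never collide, so every local
# insert decision is globally valid and uniqueness becomes I-confluent:
r0, r1 = ReplicaIdGen(0, 2), ReplicaIdGen(1, 2)
assert {r0.next_id(), r0.next_id()} == {0, 2}
assert {r1.next_id(), r1.next_id()} == {1, 3}
```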