In this talk, I give an overview of Trill, describe two projects that expand Trill's functionality, and introduce Quill, a new multi-node offline analytics system I have been working on at MSR.
32. shards
• querying
• data movement
• keying
Operation      Description
Query          Applies unmodified query on each (keyed) shard
Broadcast      Duplicate each shard’s contents on all shards
Multicast      Copy tuples from each input shard to zero or more specific result shards
ReShard        Load balance across shards
ReDistribute   Move tuples so that same key resides in same result shard
ReKey          Changes key associated with each row in each shard
…
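The operations in the table can be sketched on a toy model, to make their semantics concrete. This is only an illustration, not Quill's API: a sharded dataset is assumed to be a plain list of shards, each shard a list of (key, payload) tuples, and every function name below is hypothetical.

```python
# Toy model: a sharded dataset is a list of shards, and each shard is a
# list of (key, payload) tuples. All names here are hypothetical.

def query(shards, fn):
    """Query: apply the same, unmodified function to every shard."""
    return [fn(shard) for shard in shards]

def broadcast(shards):
    """Broadcast: every result shard holds the contents of all input shards."""
    everything = [row for shard in shards for row in shard]
    return [list(everything) for _ in shards]

def multicast(shards, targets, n_out):
    """Multicast: copy each row to the result shards listed by targets(row)."""
    out = [[] for _ in range(n_out)]
    for shard in shards:
        for row in shard:
            for t in targets(row):  # may be empty: the row is dropped
                out[t].append(row)
    return out

def reshard(shards, n_out):
    """ReShard: load balance by dealing rows round-robin, ignoring keys."""
    out = [[] for _ in range(n_out)]
    rows = [row for shard in shards for row in shard]
    for i, row in enumerate(rows):
        out[i % n_out].append(row)
    return out

def redistribute(shards, n_out):
    """ReDistribute: rows with the same key end up in the same result shard."""
    out = [[] for _ in range(n_out)]
    for shard in shards:
        for key, payload in shard:
            out[hash(key) % n_out].append((key, payload))
    return out

def rekey(shards, key_fn):
    """ReKey: change the key of each row; rows stay in their shard."""
    return [[(key_fn(payload), payload) for _, payload in shard]
            for shard in shards]
```

In this model only Broadcast, Multicast, ReShard, and ReDistribute move rows between shards; Query and ReKey work shard by shard.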
Badrish Chandramouli @ DEBS 2016
34. e => e.Count()
Flat redistribute
e => e.Count()
e => e.Sum()
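One way to read this slide is as a two-phase aggregate: count per key inside each shard, redistribute the partial counts by key, then sum them. The sketch below assumes that reading and uses a toy list-of-shards model with hypothetical function names; it is not Quill code.

```python
from collections import Counter

def local_count(shard):
    """Phase 1 (the e => e.Count() step): partial count per key in one shard."""
    return list(Counter(key for key, _ in shard).items())

def flat_redistribute(shards, n_out):
    """Send every (key, value) row directly to the shard that owns its key."""
    out = [[] for _ in range(n_out)]
    for shard in shards:
        for key, value in shard:
            out[hash(key) % n_out].append((key, value))
    return out

def global_sum(shard):
    """Phase 2 (the e => e.Sum() step): add up the partial counts per key."""
    totals = Counter()
    for key, partial in shard:
        totals[key] += partial
    return dict(totals)

def sharded_count(shards, n_out):
    partials = [local_count(s) for s in shards]   # no data movement
    moved = flat_redistribute(partials, n_out)    # only partials move
    return [global_sum(s) for s in moved]
```

Counting before moving means at most one row per key leaves each shard, rather than every input row.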
35. e => e.Count()
[Diagram: plan built from repeated [ReKey], AGG, [ReDist], and Union stages]
e => e.Sum()
36. (l,r) => l.Join(r, …)
(l,r) => l.Join(r, …)
Flat redistribute
Flat broadcast
No data movement
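The three labels above match three common ways to run an equi-join over sharded inputs. The sketch below shows them on a toy list-of-shards model. All function names are hypothetical, and the pairing of each function with a slide label is an assumption rather than something the slides state.

```python
def local_join(left, right):
    """Hash equi-join of two shards of (key, payload) rows."""
    table = {}
    for key, payload in right:
        table.setdefault(key, []).append(payload)
    return [(key, l, r) for key, l in left for r in table.get(key, [])]

def by_key(shards, n_out):
    """Move rows so that equal keys share a result shard."""
    out = [[] for _ in range(n_out)]
    for shard in shards:
        for key, payload in shard:
            out[hash(key) % n_out].append((key, payload))
    return out

def join_redistribute(left, right, n_out):
    """Redistribute both inputs by join key, then join shard by shard."""
    l, r = by_key(left, n_out), by_key(right, n_out)
    return [local_join(a, b) for a, b in zip(l, r)]

def join_broadcast(left, right):
    """Copy the whole right input to every left shard; left rows stay put."""
    whole_right = [row for shard in right for row in shard]
    return [local_join(shard, whole_right) for shard in left]

def join_colocated(left, right):
    """No data movement: valid only if both inputs are already sharded
    by the join key in the same way."""
    return [local_join(a, b) for a, b in zip(left, right)]
```

In this sketch, broadcast is the cheaper choice when one input is small, and the no-movement variant applies only when an earlier step has already placed matching keys in matching shards.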