Cmu-2011-09.pptx
- 2. Outline
  • MapR system overview
  • Map-reduce review
  • MapR architecture
  • Performance results
  • Map-reduce on MapR
  • Architectural implications
  • Search indexing / deployment
  • EM algorithm for machine learning
  • … and more …
- 3. Map-Reduce
  [Diagram: the classic map-reduce data flow; input splits feed map tasks, a shuffle routes intermediate data to reduce tasks, and the reduce tasks write the output]
- 4. Bottlenecks and Issues
  • Read-only files
  • Many copies in I/O path
  • Shuffle based on HTTP
    • Can't use new technologies
    • Eats file descriptors
  • Spills go to local file space
    • Bad for skewed distribution of sizes
- 5. MapR Areas of Development
  • HBase
  • Map-reduce
  • Ecosystem
  • Storage
  • Management services
- 6. MapR Improvements
  • Faster file system
    • Fewer copies
    • Multiple NICs
    • No file descriptor or page-buf competition
  • Faster map-reduce
    • Uses distributed file system
    • Direct RPC to receiver
    • Very wide merges
- 7. MapR Innovations
  • Volumes
    • Distributed management
    • Data placement
  • Read/write random-access file system
    • Allows distributed meta-data
    • Improved scaling
    • Enables NFS access
  • Application-level NIC bonding
  • Transactionally correct snapshots and mirrors
- 8. MapR's Containers
  Files/directories are sharded into blocks, which are placed into mini NNs (containers) on disks
  • Each container contains
    • Directories & files
    • Data blocks
  • Containers are replicated on servers
  • Containers are 16-32 GB segments of disk, placed on nodes
  • No need to manage containers directly
- 9. MapR's Containers
  • Each container has a replication chain
  • Updates are transactional
  • Failures are handled by rearranging replication
- 10. Container locations and replication
  [Diagram: containers spread across nodes N1, N2, N3, each labeled with its replica set, e.g. (N1, N2), (N3, N2), (N1, N3); a CLDB node sits alongside]
  The container location database (CLDB) keeps track of the nodes hosting each container and the replication chain order (a sketch of the idea follows)
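
  A minimal sketch of the bookkeeping the CLDB does, written as plain in-memory Java for illustration (an assumption: the real CLDB is a replicated service, not a HashMap):

    import java.util.*;

    public class Cldb {
      // container id -> replication chain, in order (chain head first)
      private final Map<Long, List<String>> chains = new HashMap<>();

      public void setChain(long containerId, List<String> nodesInOrder) {
        chains.put(containerId, new ArrayList<>(nodesInOrder));
      }

      // A client asks the CLDB which nodes host a given container.
      public List<String> locate(long containerId) {
        return chains.getOrDefault(containerId, Collections.emptyList());
      }

      // Failure handling (slide 9): drop the dead node from every chain,
      // i.e. rearrange replication.
      public void nodeFailed(String node) {
        for (List<String> chain : chains.values()) chain.remove(node);
      }
    }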
- 11. MapR Scaling
  Containers represent 16-32 GB of data
  • Each can hold up to 1 billion files and directories
  • 100M containers = ~2 exabytes (a very large cluster)
  250 bytes of DRAM to cache a container
  • 25 GB to cache all containers for a 2 EB cluster
    • But not necessary; can page to disk
  • Typical large 10 PB cluster needs 2 GB
  Container-reports are 100x-1000x smaller than HDFS block-reports
  • Serve 100x more data-nodes
  • Increase container size to 64 GB to serve a 4 EB cluster
  • Map/reduce not affected
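
  Checking the slide's own arithmetic: 100M containers at ~20 GB each is about 2 x 10^18 bytes = 2 EB, and 100M x 250 bytes = 25 GB of DRAM to cache every container location; a 10 PB cluster at the same container size is roughly 500K containers, whose locations fit comfortably within the quoted 2 GB.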
- 12. MapR's Streaming Performance
  [Bar chart: throughput in MB per second for read and write, comparing raw hardware, MapR, and Hadoop on two configurations: 11 x 7200rpm SATA and 11 x 15Krpm SAS. Higher is better.]
  Tests: i. 16 streams x 120 GB; ii. 2000 streams x 1 GB
- 13. Terasort on MapR
  10+1 nodes: 8 core, 24 GB DRAM, 11 x 1 TB SATA 7200 rpm
  [Bar chart: elapsed time in minutes for 1.0 TB and 3.5 TB sorts, MapR vs Hadoop. Lower is better.]
- 14. HBase on MapR
  YCSB random read with 1 billion 1K records
  10+1 node cluster: 8 core, 24 GB DRAM, 11 x 1 TB 7200 RPM
  [Bar chart: records per second under Zipfian and uniform key distributions, MapR vs Apache. Higher is better.]
- 15. Small Files (Apache Hadoop, 10 nodes)
  [Chart: file-create rate (files/sec) vs. # of files (millions), out-of-the-box vs. tuned]
  Op: create file, write 100 bytes, close
  Notes:
  • NN not replicated
  • NN uses 20 G DRAM
  • DN uses 2 G DRAM
- 16. MUCH faster for some operations
  Same 10 nodes …
  [Chart: create rate vs. # of files (millions)]
- 17. What MapR is not
  • Volumes != federation
    • MapR supports > 10,000 volumes, all with independent placement and defaults
    • Volumes support snapshots and mirroring
  • NFS != FUSE
    • Checksum and compress at gateway
    • IP fail-over
    • Read/write/update semantics at full speed
  • MapR != maprfs
- 19. Alternative NFS mounting models
  • Export to the world
    • NFS gateway runs on selected gateway hosts
  • Local server
    • NFS gateway runs on local host
    • Enables local compression and checksumming
  • Export to self
    • NFS gateway runs on all data nodes, mounted from localhost
  (a sample mount command follows)
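
  In all three models the cluster shows up as an ordinary NFS export, so a standard Linux mount works; a hedged example, with "gw1" as a hypothetical gateway host and /mapr as the local mount point:

    $ sudo mount -t nfs -o nolock gw1:/mapr /mapr

  After the mount, cluster files appear under /mapr like any local path.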
- 20. Export to the world
  [Diagram: an NFS client talking to several NFS servers running on gateway hosts in front of the cluster]
- 21. Local server
  [Diagram: application and NFS server co-located on the client machine, which talks to the cluster nodes]
- 22. Universal export to self
  [Diagram: a cluster node running both a task and an NFS server, mounted from localhost]
- 23. Nodes are identical
  [Diagram: every cluster node runs the same stack, a task plus an NFS server]
- 24. Application architecture
  • High performance map-reduce is nice
  • But algorithmic flexibility is even nicer
- 25. Sharded text indexing
  [Diagram: input documents → Map (assign documents to shards) → Reducer (index text to local disk, then copy the index to the distributed file store) → clustered index storage → Search Engine, which must copy the index to local disk before it can be loaded]
- 26. Sharded text indexing
  • Mapper assigns document to shard
    • Shard is usually hash of document id (see the sketch below)
  • Reducer indexes all documents for a shard
    • Indexes created on local disk
    • On success, copy index to DFS
    • On failure, delete local files
  • Must avoid directory collisions
    • Can't use shard id!
  • Must manage and reclaim local disk space
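
  A minimal sketch of the two details above, as plain Java (the class and helper names are illustrative, not from the deck):

    import java.io.File;

    public class ShardAssigner {
      private final int numShards;

      public ShardAssigner(int numShards) { this.numShards = numShards; }

      // Mapper side: shard is usually a hash of the document id,
      // masked to keep it non-negative.
      public int shardFor(String docId) {
        return (docId.hashCode() & Integer.MAX_VALUE) % numShards;
      }

      // Reducer side: the local index directory must not be named by
      // shard id alone, or two attempts of the same reduce task collide.
      // Folding in the task attempt id keeps directories unique.
      public File localIndexDir(int shard, String taskAttemptId) {
        return new File("/tmp/index-" + shard + "-" + taskAttemptId);
      }
    }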
- 27. Conventional data flow
  [Diagram: the same indexing flow as slide 25, with two failure callouts]
  • Failure of a reducer causes garbage to accumulate in the local disk
  • Failure of the search engine requires another download of the index from clustered storage
- 28. Simplified NFS data flows
  [Diagram: input documents → Map → Reducer, which indexes straight into its task work directory via NFS → clustered index storage]
  • Failure of a reducer is cleaned up by the map-reduce framework
  • Search engine reads the mirrored index directly
- 29. Simplified NFS data flows
  [Diagram: reducer output mirrored out to several search engines]
  • Mirroring allows exact placement of index data
  • Arbitrary levels of replication also possible
- 31. K-means
  • Classic E-M based algorithm
  • Given cluster centroids:
    • Assign each data point to nearest centroid
    • Accumulate new centroids
    • Rinse, lather, repeat (one iteration is sketched below)
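
  One E-M iteration in plain, in-memory Java (a sketch for intuition only; the map-reduce formulation follows on slides 36-37):

    public class KMeansStep {
      // points and centroids are rows of doubles with the same dimension
      public static double[][] step(double[][] points, double[][] centroids) {
        int k = centroids.length, d = centroids[0].length;
        double[][] sums = new double[k][d];
        int[] counts = new int[k];

        // E-step: assign each point to the nearest centroid
        for (double[] p : points) {
          int best = 0;
          double bestDist = Double.MAX_VALUE;
          for (int c = 0; c < k; c++) {
            double dist = 0;
            for (int j = 0; j < d; j++) {
              double diff = p[j] - centroids[c][j];
              dist += diff * diff;
            }
            if (dist < bestDist) { bestDist = dist; best = c; }
          }
          counts[best]++;
          for (int j = 0; j < d; j++) sums[best][j] += p[j];
        }

        // M-step: new centroid = mean of its assigned points
        // (an empty cluster keeps a zero centroid in this sketch)
        for (int c = 0; c < k; c++)
          if (counts[c] > 0)
            for (int j = 0; j < d; j++) sums[c][j] /= counts[c];
        return sums; // rinse, lather, repeat until centroids stop moving
      }
    }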
- 32. K-means, the movie
  [Diagram: input → assign to nearest centroid → aggregate new centroids → loop back to centroids]
- 36. Old tricks, new dogs
  • Mapper
    • Assign point to cluster
    • Emit cluster id, (1, point)
    • Centroids are read from local disk, staged there from HDFS by the distributed cache
  • Combiner and reducer (sketched below)
    • Sum counts, weighted sum of points
    • Emit cluster id, (n, sum/n)
  • Output to HDFS, written by map-reduce
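
  The combine step is associative, so one piece of arithmetic serves both combiner and reducer; a sketch in plain Java (the class names are illustrative):

    import java.util.List;

    public class CentroidAccumulator {
      // One partial result per cluster id: a count and a weighted sum.
      public static final class Partial {
        long n;
        double[] sum;
        Partial(long n, double[] sum) { this.n = n; this.sum = sum; }
      }

      // Combiner and reducer both just add counts and vector sums.
      public static Partial merge(List<Partial> parts) {
        int d = parts.get(0).sum.length;
        Partial out = new Partial(0, new double[d]);
        for (Partial p : parts) {
          out.n += p.n;
          for (int j = 0; j < d; j++) out.sum[j] += p.sum[j];
        }
        return out;
      }

      // Reducer output for a cluster id is (n, sum/n): the new centroid.
      public static double[] centroid(Partial p) {
        double[] mean = p.sum.clone();
        for (int j = 0; j < mean.length; j++) mean[j] /= p.n;
        return mean;
      }
    }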
- 37. Old tricks, new dogs
  • Mapper
    • Assign point to cluster
    • Emit cluster id, (1, point)
    • Centroids are read directly from NFS
  • Combiner and reducer
    • Sum counts, weighted sum of points
    • Emit cluster id, (n, sum/n)
  • Output to HDFS (MapR FS), written by map-reduce
- 38. Poor man's Pregel
  • Mapper

      while not done:
          # read and accumulate input models
          for each input:
              accumulate model
          write model
          synchronize
          reset input format
      emit summary

  • Lines in bold can use conventional I/O via NFS (a runnable sketch follows)
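
  A runnable sketch of that loop under stated assumptions: the model is a double[] stored one value per line in files beneath a shared NFS-mounted directory, "synchronize" is a barrier that waits for every peer's file, and accumulation is a simple average. Illustration only, not MapR's implementation:

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.List;

    public class PoorMansPregel {
      static double[] readModel(Path f) throws IOException {
        List<String> lines = Files.readAllLines(f);
        double[] m = new double[lines.size()];
        for (int i = 0; i < m.length; i++) m[i] = Double.parseDouble(lines.get(i));
        return m;
      }

      static void writeModel(Path f, double[] m) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (double v : m) sb.append(v).append('\n');
        Files.createDirectories(f.getParent());
        Files.write(f, sb.toString().getBytes());
      }

      public static void main(String[] args) throws Exception {
        Path shared = Paths.get(args[0]);        // shared, NFS-mounted directory
        int me = Integer.parseInt(args[1]);      // this mapper's index
        int peers = Integer.parseInt(args[2]);   // total number of mappers
        double[] model = new double[10];         // toy model for the sketch

        for (int iter = 0; iter < 5; iter++) {   // "while not done"
          Path dir = shared.resolve(String.valueOf(iter));
          writeModel(dir.resolve("model-" + me), model);  // write model

          // synchronize: wait until all peers wrote this iteration
          while (true) {
            int n = 0;
            try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir, "model-*")) {
              for (Path ignored : ds) n++;
            }
            if (n >= peers) break;
            Thread.sleep(100);
          }

          // read and accumulate input models (here: average them)
          double[] next = new double[model.length];
          for (int p = 0; p < peers; p++) {
            double[] other = readModel(dir.resolve("model-" + p));
            for (int j = 0; j < next.length; j++) next[j] += other[j] / peers;
          }
          model = next;                          // reset input for next pass
        }
        System.out.println("summary: model[0] = " + model[0]); // emit summary
      }
    }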
- 39. Click modeling architecture
  [Diagram: input → feature extraction, join, and down-sampling (map-reduce) → data → sequential SGD learning; side-data now arrives via NFS]
- 40. Click modeling architecture
  [Diagram: input → feature extraction, join, and down-sampling (map-reduce) → several sequential SGD learners running in parallel; map-reduce cooperates with NFS for the side-data]
- 42. Hybrid model flow
  [Diagram: feature extraction and down-sampling (map-reduce) → SVD (PageRank) (spectral) (map-reduce) → down-stream modeling → deployed model; the hand-off between stages is marked "??"]
- 44. Hybrid model flow
  [Diagram: the same flow, with the "??" hand-off now filled in as a sequential stage plus map-reduce]
- 46. Trivial visualization interface
  • Map-reduce output is visible via NFS

      $ R
      > x <- read.csv("/mapr/my.cluster/home/ted/data/foo.out")
      > plot(error ~ t, x)
      > q(save='n')

  • Legacy visualization just works
- 47. Conclusions
  • We used to know all this
  • Tab completion used to work
  • 5 years of work-arounds have clouded our memories
  • We just have to remember the future