©MapR Technologies - Confidential
MapReduce Improvements in the
MapR Hadoop Distribution
Adam Bordelon, Senior Software Engineer at MapR
Big Data Madison meetup - 9/26/2013
What's this all about?
● Background on Hadoop
● Big Data: Distributed Filesystems
● Big Compute:
– MapReduce
– Beyond MapReduce
● Q&A
Hadoop History
http://s.wsj.net/public/resources/images/MI-BX925_GOOGLE_G_20130818173254.jpg
Big Data: Distributed FileSystems
Volume, Variety, Velocity:
Can't have big data without a scalable filesystem
http://www.lbisoftware.com/blog/wp-content/uploads/2013/06/data_mountain1.jpg
HDFS Architecture
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
HDFS Architectural Flaws
● Created for storing crawled web-page data
● Files cannot be modified once written/closed.
– Write-once; append-only
● Files cannot be read before they are closed.
– Must batch-load data
● NameNode stores (in memory)
– Directory/file tree, file->block mapping
– Block replica locations
● NameNode only scales to ~100 million files
– Some users run jobs to concatenate small files
● Written in Java; pauses during garbage collection
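To make the write-once/append-only point concrete, here is a minimal, illustrative Java sketch (the path is hypothetical, and append must be enabled on the cluster): the stock HDFS FileSystem API lets you create a file or append to its end, but offers no call to overwrite bytes in the middle of a closed file.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsWriteOnceDemo {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path p = new Path("/data/events.log");  // hypothetical path

            // Create and close the file; after close() the contents are immutable.
            try (FSDataOutputStream out = fs.create(p)) {
                out.writeBytes("first batch\n");
            }

            // The only supported mutation is appending to the end (if enabled);
            // there is no API for seeking and overwriting bytes in place.
            try (FSDataOutputStream out = fs.append(p)) {
                out.writeBytes("second batch\n");
            }
        }
    }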
Solution: MapR FileSystem
● Visionary CTO/Co-Founder: M.C. Srivas
– Ran Google search infrastructure team
– Chief Storage Architect at Spinnaker Networks
● Take a step back: what kind of DFS do we need in a Hadoop distributed computer?
– Easy, Scalable, Reliable
● Want traditional apps to work with DFS
– Support random Read/Write
– Standard FS interface (NFS)
● HDFS compatible
– Drop-in replacement, no recompile
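A minimal sketch of what "drop-in, no recompile" means in practice (the fs.defaultFS value and the /user path are assumptions for illustration): the same Hadoop FileSystem code runs unchanged whether the default filesystem is hdfs:// or maprfs://.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DropInListing {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Point the default filesystem at MapR-FS instead of HDFS;
            // the application code below does not change either way.
            conf.set("fs.defaultFS", "maprfs:///");  // assumed cluster default
            FileSystem fs = FileSystem.get(conf);
            for (FileStatus status : fs.listStatus(new Path("/user"))) {
                System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
            }
        }
    }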
Easy: POSIX-compliant NFS
Easy: MapR Volumes
Groups related files/directories into a single tree structure so they can be easily organized, managed, and secured.
● Replication factor
● Scheduled snapshots, mirroring
● Data placement control
– By device-type, rack, or geographic location
● Quotas and usage tracking
● Administrative permissions
100K+ volumes are okay
Scalable: Containers
Files/directories are sharded into blocks, which are placed into mini-NameNodes (containers) on disks.
Containers are 16-32 GB disk segments, placed on nodes.
● Each container contains
– Directories & files
– Data blocks
● Replicated on servers
● No need to manage directly
– Use MapR Volumes
Scalable: Container Location DB
The container location database (CLDB) keeps track of the nodes hosting each container and the replication chain order.
● Each container has a replication chain
● Updates are transactional
● Failures are handled by rearranging replication
● Clients cache container locations
Scalability Statistics
Containers represent 16-32 GB of data:
● Each can hold up to 1 billion files and directories
● 100M containers = ~2 exabytes (a very large cluster)
250 bytes of DRAM to cache a container:
● 25 GB to cache all containers for a 2 EB cluster
– But not necessary; can page to disk
● Typical large 10 PB cluster needs 2 GB
Container reports are 100x-1000x smaller than HDFS block reports:
● Serve 100x more data nodes
● Increase container size to 64 GB to serve a 4 EB cluster
● MapReduce performance not affected
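As a quick sanity check on those figures (my own arithmetic, assuming an average container size of roughly 20 GB):

    100,000,000 containers x ~20 GB/container ≈ 2,000,000,000 GB ≈ 2 EB
    100,000,000 containers x 250 bytes of CLDB state each = 25 GB of DRAM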
Record-breaking Speed

Benchmark                          MapR 2.1.1       CDH 4.1.1        MapR Speed Increase
Terasort (1x replication, compression disabled)
  Total                            13m 35s          26m 6s           2X
  Map                              7m 58s           21m 8s           3X
  Reduce                           13m 32s          23m 37s          1.8X
DFSIO throughput/node
  Read                             1003 MB/s        656 MB/s         1.5X
  Write                            924 MB/s         654 MB/s         1.4X
YCSB (50% read, 50% update)
  Throughput                       36,584.4 op/s    12,500.5 op/s    2.9X
  Runtime                          3.80 hr          11.11 hr         2.9X
YCSB (95% read, 5% update)
  Throughput                       24,704.3 op/s    10,776.4 op/s    2.3X
  Runtime                          0.56 hr          1.29 hr          2.3X
Benchmark hardware configuration: 10 servers, 12x2 cores (2.4 GHz), 12x2 TB disks, 48 GB RAM, 1x10GbE

NEW WORLD RECORD: BREAKING THE TERASORT MINUTE BARRIER
         MapR w/Google    Apache Hadoop
  Time   54s              62s
  Nodes  1003             1460
  Disks  1003             5840
  Cores  4012             11680
Reliable: CLDB High Availability
● As easy as installing CLDB role on more nodes
– Writes go to CLDB master, replicated to slaves
– CLDB slaves can serve reads
● Distributed container metadata, so CLDB only
stores/recovers container locations
– Instant restart (<2 seconds), no single point of failure
● Shared nothing architecture
● (NFS Multinode HA too)
vs. Federated NN, NN HA
● Federated NameNodes
– Statically partition namespaces (like Volumes)
– Need additional NN (plus a standby) for each namespace
– Federated NN only in Hadoop-2.x (beta)
● NameNode HA
– NameNode responsible for both fs-namespace (metadata) info and block
locations; more data to checkpoint/recover.
– Starting a standby NN from a cold state can take tens of minutes for metadata and an hour for block locations; need a hot standby.
– Metadata state
● All name space edits logged to shared (NFS/NAS) R/W storage, which must
also be HA; Standby polls edit log for changes.
● Or use Quorum Journal Manager, separate service/nodes
– Block locations
● Data nodes send block reports, location updates, heartbeats to both NNs
Reliable: Consistent Snapshots
● Automatic de-duplication
● Saves space by sharing blocks
● Lightning fast
● Zero performance loss on writes to the original
● Scheduled or on-demand
● Easy recovery with drag and drop
Reliable: Mirroring
MapR Filesystem Summary
● Easy
– Direct Access NFS
– MapR Volumes
● Fast
– C++ vs. Java
– Direct disk access, no layered filesystems
– Lockless transactions
– High-speed RPC
– Native compression
● Scalable
– Containers, distributed metadata
– Container Location DB
● Reliable
– CLDB High Availability
– Snapshots
– Mirroring
Big Compute: MapReduce
http://developer.yahoo.com/hadoop/tutorial/module4.html
Fast: Direct Shuffle
● Apache Shuffle
– Write map-outputs/spills to local file system
– Merge partitions for a map output into one file, index into it
– Reducers request partitions from Mappers' HTTP servlets
● MapR Direct Shuffle
– Write to Local Volume in MapR FS (rebalancing)
– Map-output file per reducer (no index file)
– Send shuffleRootFid with MapTaskCompletion on heartbeat
– Direct RPC from Reducer to Mapper using Fid
– Copy is just a file-system copy; no HTTP overhead
– More copy threads, wider merges
Fast: Express Lane
● Long-running jobs shouldn't hog all the slots in the
cluster and starve small, fast jobs (e.g. Hive queries)
● One or more small slots reserved on each node for
running small jobs
● Small jobs: <10 maps/reducers, small input, time limit
Reliable: JobTracker HA
Easy: Label-based Scheduling
● Assign labels to nodes or regex/glob expressions for nodes
– perfnode1* → “production”
– /.*ssd[0-9]*/ → “fast_ssd”
● Create label expressions for jobs/queues
– Queue “fast_prod” → “production && fast_ssd”
● Tasks from these jobs/queues will only be assigned to nodes whose
labels match the expression.
● Combine with Data Placement policies for data and compute locality
● No static partitioning necessary
– Frequent labels file refresh
– New nodes automatically fall into appropriate regex/glob labels
– New jobs can specify a label expression, use the queue's, or both (a configuration sketch follows below)
● http://www.mapr.com/doc/display/MapR/Placing+Jobs+on+Specified+Nodes
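As a hedged sketch of what submitting a job with a label expression might look like (the property name mapred.job.label is my assumption; the MapR documentation linked above has the authoritative key):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class LabeledJobSubmit {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumed property name for the job's label expression; only nodes whose
            // labels satisfy the expression will run this job's tasks.
            conf.set("mapred.job.label", "production && fast_ssd");
            Job job = new Job(conf, "labeled-job");
            // ... configure mapper, reducer, input and output paths as usual ...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }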
Other Improvements
● Parallel Split Computations in JobClient
– Might as well multi-thread it! (a generic sketch follows this slide)
● Runaway Job Protection
– One user's fork-bomb shouldn't degrade others' performance
– CPU/memory firewalls protect system processes
● Map-side join locality
– Files in same directory/container follow same replication chain
– Same key ranges likely to be co-located on same node.
● Zero-config XML
– XML parsing takes too much time
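A generic illustration of the parallel-split idea mentioned above (not MapR's actual code; splitsFor() is a stand-in for whatever per-input-path work InputFormat.getSplits() does in the JobClient):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelSplitComputation {
        // Stand-in for computing the splits of a single input path.
        static List<String> splitsFor(String path) {
            return Arrays.asList(path + "#split-0", path + "#split-1");
        }

        public static void main(String[] args) throws Exception {
            List<String> inputPaths = Arrays.asList("/data/a", "/data/b", "/data/c");
            ExecutorService pool = Executors.newFixedThreadPool(4);
            List<Future<List<String>>> pending = new ArrayList<>();
            for (String path : inputPaths) {
                pending.add(pool.submit(() -> splitsFor(path)));  // one task per input path
            }
            List<String> allSplits = new ArrayList<>();
            for (Future<List<String>> f : pending) {
                allSplits.addAll(f.get());  // collect results in input order
            }
            pool.shutdown();
            System.out.println(allSplits);
        }
    }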
MapR MapReduce Summary
● Fast
– Direct Shuffle
– Express Lane
– Parallel Split Computation
– Map-side Join Locality
– Zero-config XML
● Reliable
– JobTracker HA
– Runaway Job Protection
● Easy
– Label-based Scheduling
Beyond MapReduce...
http://www.nasa.gov/sites/default/files/potw1335a_0.jpg
M7: Enterprise-Grade HBase
Other distributions layer HBase (in a JVM) on a DFS (another JVM) on ext3 on the disks; M7 runs on a single unified data platform directly against the disks.
● Easy: no RegionServers, seamless splits, automatic merges, in-memory column families
● Dependable: no compactions, instant recovery from node failure, snapshots, mirroring
● Fast: consistent low latency, real-time in-memory configuration, disk and network compression, reduced I/O to disk
● Unified Data Platform
● Increased Performance
● Simplified Administration
Apache Drill
Interactive analysis of Big Data using standard SQL; based on Google Dremel.
● Interactive queries (100 ms to 20 min), for data analysts and reporting: Drill's target
● Batch processing (20 min to 20 hr), for data mining, modeling, and large ETL: MapReduce, Hive, Pig
● Fast
– Low-latency queries
– Columnar execution
– Complements native interfaces and MapReduce/Hive/Pig
● Open
– Community-driven open source project
– Under the Apache Software Foundation
● Modern
– Standard ANSI SQL:2003 (select/into)
– Nested/hierarchical data support
– Schema is optional
– Supports RDBMS, Hadoop and NoSQL
Apache YARN (aka MR2)
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
Why MapR?
http://www.mapr.com/products/why-mapr
Contact Us!
I'm not in Sales, so go to mapr.com to learn more:
– Integrations with AWS, GCE, Ubuntu, Lucidworks
– Partnerships, Customers
– Support, Training, Pricing
– Ecosystem Components
We're hiring!
University of Wisconsin-Madison Career Fair tomorrow
Email me at: abordelon@maprtech.com
©MapR Technologies - Confidential
Questions?