MapReduce
michel.bruley@teradata.com

Extract from various presentations: Sudarshan, Chungnam, Teradata Aster, …

April 2012

www.decideo.fr/bruley
What is MapReduce?
Restricted parallel programming model meant for large
clusters
– User implements Map() and Reduce() functions
Parallel computing framework
– Libraries take care of EVERYTHING else
• Parallelization
• Fault Tolerance
• Data Distribution
• Load Balancing
Useful model for many practical tasks
Map and Reduce
The idea of Map and Reduce is more than 40 years old
– Present in all Functional Programming Languages.
– See, e.g., APL, Lisp and ML
Alternate names for Map: Apply-All
Higher Order Functions
– take function definitions as arguments, or
– return a function as output
Map and Reduce are higher-order functions.
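
To make the higher-order idea concrete, here is a minimal Python sketch. The names `apply_all` and `make_scaler` are hypothetical and chosen only for illustration; `map` and `functools.reduce` are standard Python.

```python
from functools import reduce

def apply_all(fn, items):
    """Map ("apply-all"): takes a function as an argument and applies it to every item."""
    return list(map(fn, items))

def make_scaler(factor):
    """A higher-order function that returns a function as its output."""
    return lambda x: x * factor

numbers = [1, 2, 3, 4]
doubled = apply_all(make_scaler(2), numbers)         # each element multiplied by 2
total = reduce(lambda acc, x: acc + x, doubled, 0)   # fold the list into one value
```

Here `apply_all` and `reduce` both take a function as an argument, and `make_scaler` returns one, which covers both senses of "higher-order" listed above.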

Map and Reduce Functions
Functions borrowed from functional programming
languages (e.g. Lisp)
Map()
– Process a key/value pair to generate intermediate
key/value pairs
Reduce()
– Merge all intermediate values associated with the same
key

Example: Counting Words
Map()
– Input <filename, file text>
– Parses file and emits <word, count> pairs
• e.g. <"hello", 1>
Reduce()
– Sums all values for the same key and emits <word,
TotalCount>
• e.g. <"hello", (3 5 2 7)> => <"hello", 17>
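
The word-count example can be sketched in a few lines of Python. This is a single-process illustration, not how a cluster framework runs it: the driver `run_word_count`, the lower-casing of words, and the sample file names are assumptions made for the sketch.

```python
from collections import defaultdict

def map_word_count(filename, text):
    """Map: emit a <word, 1> pair for every word in the file text."""
    for word in text.lower().split():
        yield word, 1

def reduce_word_count(word, counts):
    """Reduce: sum all counts emitted for one word."""
    return word, sum(counts)

def run_word_count(files):
    """Hypothetical in-memory driver: map, group values by key, then reduce."""
    grouped = defaultdict(list)
    for filename, text in files.items():
        for word, count in map_word_count(filename, text):
            grouped[word].append(count)
    return dict(reduce_word_count(w, c) for w, c in grouped.items())

files = {"a.txt": "hello world hello", "b.txt": "Hello MapReduce"}
counts = run_word_count(files)
```

In a real framework the grouping step is done by the library between the map and reduce phases; the user only writes the two functions.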

Execution on Clusters
1. Input files split (M splits)
2. Assign Master & Workers
3. Map tasks
4. Writing intermediate data to disk (R regions)
5. Intermediate data read & sort
6. Reduce tasks
7. Return
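
The seven steps can be imitated in one process to show how data moves between them. The sketch below is an assumption-laden toy: all function names are hypothetical, in-memory lists stand in for files on disk, keys are assumed to be strings, and `zlib.crc32` is used only as a stable hash for choosing a region.

```python
from collections import defaultdict
from zlib import crc32

def split_input(records, m):
    """Step 1: divide the input records into M splits."""
    return [records[i::m] for i in range(m)]

def run_map_task(split, map_fn, r):
    """Steps 3-4: run map over one split and bucket its output into R regions."""
    regions = [[] for _ in range(r)]
    for key, value in split:
        for k2, v2 in map_fn(key, value):
            regions[crc32(k2.encode("utf-8")) % r].append((k2, v2))
    return regions

def run_reduce_task(region_index, map_outputs, reduce_fn):
    """Steps 5-6: read one region from every map task, sort by key, then reduce."""
    pairs = sorted(p for out in map_outputs for p in out[region_index])
    grouped = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    return [reduce_fn(k, vs) for k, vs in grouped.items()]

def run_job(records, map_fn, reduce_fn, m=3, r=2):
    """Step 2 is implicit: this function plays the master, the task functions the workers."""
    map_outputs = [run_map_task(s, map_fn, r) for s in split_input(records, m)]
    results = []
    for i in range(r):
        results.extend(run_reduce_task(i, map_outputs, reduce_fn))
    return dict(results)  # Step 7: return the combined output

records = [("doc1", "a b a"), ("doc2", "b c"), ("doc3", "a")]
result = run_job(
    records,
    map_fn=lambda name, text: [(w, 1) for w in text.split()],
    reduce_fn=lambda word, ones: (word, sum(ones)),
)
```

Because every pair with the same key lands in the same region, each reduce task sees all values for its keys, regardless of which map task produced them.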

Map/Reduce Cluster Implementation
[Figure: input files (split 0 to split 4) feed M map tasks, which write intermediate files; R reduce tasks read them and write the output files (Output 0, Output 1)]
– Several map or reduce tasks can run on a single computer
– Each intermediate file is divided into R partitions, by partitioning function
– Each reduce task corresponds to one partition
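
A common choice of partitioning function is a hash of the key modulo R. The sketch below assumes string keys and uses `zlib.crc32` as the hash so that results are stable across runs; the names `partition` and `partition_pairs` are hypothetical.

```python
from zlib import crc32

def partition(key: str, r: int) -> int:
    """Return the index (0..r-1) of the reduce partition that owns this key."""
    return crc32(key.encode("utf-8")) % r

def partition_pairs(pairs, r):
    """Divide one map task's intermediate pairs into r partitions."""
    parts = [[] for _ in range(r)]
    for key, value in pairs:
        parts[partition(key, r)].append((key, value))
    return parts

pairs = [("hello", 1), ("world", 1), ("hello", 1)]
parts = partition_pairs(pairs, r=2)
```

Since the partition depends only on the key, every map task sends a given key to the same reduce task. Python's built-in `hash()` is avoided here because string hashes vary between processes by default.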
Map Reduce vs. Parallel Databases
Map Reduce is widely used for parallel processing
– Google, Yahoo, and hundreds of other companies
– Example uses: compute PageRank, build keyword indices,
do data analysis of web click logs, …
Database people say:
– But parallel databases have been doing this for decades
Map Reduce people say:
– We operate at scales of thousands of machines
– We handle failures seamlessly
– We allow procedural code in map and reduce and allow
data of any type
Typical MapReduce Cluster

Map Reduce Implementations
Google
– Not available outside Google
Hadoop
– An open-source implementation in Java
– Uses HDFS for stable storage
– Download: http://lucene.apache.org/hadoop/
Teradata Aster
– Cluster-optimized SQL Database that also implements
MapReduce
• IITB alumnus among founders
And several others, such as Cassandra at Facebook, etc.
MapReduce v. Hadoop
                     | MapReduce | Hadoop
Org                  | Google    | Yahoo/Apache
Impl                 | C++       | Java
Distributed File Sys | GFS       | HDFS
Data Base            | Bigtable  | HBase
Distributed lock mgr | Chubby    | ZooKeeper
Solutions Stack for Teradata Aster
[Figure: layered solutions stack with three layer labels: Aster Data Ecosystem, Aster Data Platform, Infrastructure]
– Data Integration / ETL, Business Intelligence Tools, Query Tools, Analytics Specialists
– Systems Management, Security, Aster Data nCluster
– Operating System, Servers, Cloud Infrastructure, Storage
Teradata Aster Platform Infrastructure
For physical infrastructure (non-cloud) deployments
Aster Data Analytic Platform, by layer:
– nCluster: Aster Data nCluster packaged software
– Operating System: Certified Linux operating system
– Server Hardware: Certified commodity (x86) server hardware with internal storage
Teradata Aster Infrastructure
For cloud deployments
Aster Data Analytic Platform, by layer:
– nCluster: Aster Data nCluster packaged software
– Operating System: Linux operating system
– Compute Instance (CC, xLarge): Compute instance from cloud provider (e.g. Amazon Web Services EC2)
– Storage (EBS, Ephemeral): Storage connected to cloud computing capacity
Teradata Aster Architecture for Analytics
[Figure: Aster Data nCluster architecture, top to bottom]
Your Analytics & Advanced Reporting Applications (App, App, App, App)
• Support for in-database processing of custom applications written in broad variety of languages
• Integration with third-party packaged software via ODBC/JDBC or in-database integration
Analytic Functions and Frameworks
• Rich libraries of MapReduce analytics from Aster Data and partners
• Visual development environment--develop in hours
Unified Interface (SQL, SQL-MapReduce)
• Standard SQL interface
• MapReduce processing integrated with SQL via SQL-MapReduce interface
Analytics Processing Engines (SQL, MapReduce, …)
Massively Parallel Data Stores
• Optimized SQL engine
• Fully-integrated in-database MapReduce
• Hybrid row/column DBMS
• Linear, incremental scalability
• Commodity hardware
Teradata Aster Ecosystem
Partner       | Product                | Product release | Platform for Certification
MicroStrategy | Intelligence Server    | 9.2.1 32-bit    | Windows 7, Enterprise Edition SP1, 32-bit, 64-bit
SAP           | Business Objects       | XI 3.1          | Windows 2008, 32-bit
Informatica   | Powercenter            | 9.0.1           | Client: Windows 2003/2008 Server 32 bit. Server: Windows 2003/2008 Server 32 bit and 64 bit
IBM           | Cognos                 | 10.1FP1         | n/a
Tableau       | Tableau Server         | 6               | Windows (SS: TBU)
Microsoft     | SSLS, SSAS, SSFS, SSIS | SQL Server 2008 | .NET Framework 2.0; Windows Server, 2008 64-bit; Windows 2003, 32-bit

*Oracle BIEE certification currently in process


Más contenido relacionado

La actualidad más candente

NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...
Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...
Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...CloudxLab
 
System models in distributed system
System models in distributed systemSystem models in distributed system
System models in distributed systemishapadhy
 
The rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingThe rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingMinhazul Arefin
 
Cloud Security, Standards and Applications
Cloud Security, Standards and ApplicationsCloud Security, Standards and Applications
Cloud Security, Standards and ApplicationsDr. Sunil Kr. Pandey
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoopdarugar
 
Cloud Computing Security Challenges
Cloud Computing Security ChallengesCloud Computing Security Challenges
Cloud Computing Security ChallengesYateesh Yadav
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
CS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question BankCS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question Bankpkaviya
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map ReduceApache Apex
 
Lecture 1 introduction to parallel and distributed computing
Lecture 1   introduction to parallel and distributed computingLecture 1   introduction to parallel and distributed computing
Lecture 1 introduction to parallel and distributed computingVajira Thambawita
 
Application of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingApplication of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingMohammad Mustaqeem
 
Hadoop basic commands
Hadoop basic commandsHadoop basic commands
Hadoop basic commandsbispsolutions
 

La actualidad más candente (20)

Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...
Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...
Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial...
 
System models in distributed system
System models in distributed systemSystem models in distributed system
System models in distributed system
 
The rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingThe rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computing
 
Cloud Security, Standards and Applications
Cloud Security, Standards and ApplicationsCloud Security, Standards and Applications
Cloud Security, Standards and Applications
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoop
 
Cloud Computing Security Challenges
Cloud Computing Security ChallengesCloud Computing Security Challenges
Cloud Computing Security Challenges
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
PPT on Hadoop
PPT on HadoopPPT on Hadoop
PPT on Hadoop
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
CS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question BankCS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question Bank
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
 
IaaS, SaaS, PasS : Cloud Computing
IaaS, SaaS, PasS : Cloud ComputingIaaS, SaaS, PasS : Cloud Computing
IaaS, SaaS, PasS : Cloud Computing
 
Lecture 1 introduction to parallel and distributed computing
Lecture 1   introduction to parallel and distributed computingLecture 1   introduction to parallel and distributed computing
Lecture 1 introduction to parallel and distributed computing
 
Application of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingApplication of MapReduce in Cloud Computing
Application of MapReduce in Cloud Computing
 
Hadoop basic commands
Hadoop basic commandsHadoop basic commands
Hadoop basic commands
 
Trends in distributed systems
Trends in distributed systemsTrends in distributed systems
Trends in distributed systems
 
Hadoop
HadoopHadoop
Hadoop
 

Similar a Map Reduce

Meethadoop
MeethadoopMeethadoop
MeethadoopIIIT-H
 
Hadoop bigdata overview
Hadoop bigdata overviewHadoop bigdata overview
Hadoop bigdata overviewharithakannan
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5RojaT4
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map ReduceUrvashi Kataria
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsGeoffrey Fox
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsGeoffrey Fox
 
Lightening Fast Big Data Analytics using Apache Spark
Lightening Fast Big Data Analytics using Apache SparkLightening Fast Big Data Analytics using Apache Spark
Lightening Fast Big Data Analytics using Apache SparkManish Gupta
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopGERARDO BARBERENA
 
May 29, 2014 Toronto Hadoop User Group - Micro ETL
May 29, 2014 Toronto Hadoop User Group - Micro ETLMay 29, 2014 Toronto Hadoop User Group - Micro ETL
May 29, 2014 Toronto Hadoop User Group - Micro ETLAdam Muise
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsRobert Grossman
 
Stratosphere with big_data_analytics
Stratosphere with big_data_analyticsStratosphere with big_data_analytics
Stratosphere with big_data_analyticsAvinash Pandu
 
Managing Big data Module 3 (1st part)
Managing Big data Module 3 (1st part)Managing Big data Module 3 (1st part)
Managing Big data Module 3 (1st part)Soumee Maschatak
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsLynn Langit
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonVitthal Gogate
 
Hadoop Big Data A big picture
Hadoop Big Data A big pictureHadoop Big Data A big picture
Hadoop Big Data A big pictureJ S Jodha
 

Similar a Map Reduce (20)

Meethadoop
MeethadoopMeethadoop
Meethadoop
 
Hadoop bigdata overview
Hadoop bigdata overviewHadoop bigdata overview
Hadoop bigdata overview
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data Analytics
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data Analytics
 
Lightening Fast Big Data Analytics using Apache Spark
Lightening Fast Big Data Analytics using Apache SparkLightening Fast Big Data Analytics using Apache Spark
Lightening Fast Big Data Analytics using Apache Spark
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
B04 06 0918
B04 06 0918B04 06 0918
B04 06 0918
 
Big data concepts
Big data conceptsBig data concepts
Big data concepts
 
May 29, 2014 Toronto Hadoop User Group - Micro ETL
May 29, 2014 Toronto Hadoop User Group - Micro ETLMay 29, 2014 Toronto Hadoop User Group - Micro ETL
May 29, 2014 Toronto Hadoop User Group - Micro ETL
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data Clouds
 
Stratosphere with big_data_analytics
Stratosphere with big_data_analyticsStratosphere with big_data_analytics
Stratosphere with big_data_analytics
 
Managing Big data Module 3 (1st part)
Managing Big data Module 3 (1st part)Managing Big data Module 3 (1st part)
Managing Big data Module 3 (1st part)
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
 
Map reducecloudtech
Map reducecloudtechMap reducecloudtech
Map reducecloudtech
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College London
 
Hadoop ppt2
Hadoop ppt2Hadoop ppt2
Hadoop ppt2
 
Hadoop Big Data A big picture
Hadoop Big Data A big pictureHadoop Big Data A big picture
Hadoop Big Data A big picture
 
B04 06 0918
B04 06 0918B04 06 0918
B04 06 0918
 

Más de Michel Bruley

Religion : Dieu y es-tu ? (les articles)
Religion : Dieu y es-tu ? (les articles)Religion : Dieu y es-tu ? (les articles)
Religion : Dieu y es-tu ? (les articles)Michel Bruley
 
Réflexion sur les religions : Dieu y es-tu ?
Réflexion sur les religions : Dieu y es-tu ?Réflexion sur les religions : Dieu y es-tu ?
Réflexion sur les religions : Dieu y es-tu ?Michel Bruley
 
La chute de l'Empire romain comme modèle.pdf
La chute de l'Empire romain comme modèle.pdfLa chute de l'Empire romain comme modèle.pdf
La chute de l'Empire romain comme modèle.pdfMichel Bruley
 
Synthèse sur Neuville.pdf
Synthèse sur Neuville.pdfSynthèse sur Neuville.pdf
Synthèse sur Neuville.pdfMichel Bruley
 
Propos sur des sujets qui m'ont titillé.pdf
Propos sur des sujets qui m'ont titillé.pdfPropos sur des sujets qui m'ont titillé.pdf
Propos sur des sujets qui m'ont titillé.pdfMichel Bruley
 
Propos sur les Big Data.pdf
Propos sur les Big Data.pdfPropos sur les Big Data.pdf
Propos sur les Big Data.pdfMichel Bruley
 
Georges Anselmi - 1914 - 1918 Campagnes de France et d'Orient
Georges Anselmi - 1914 - 1918 Campagnes de France et d'OrientGeorges Anselmi - 1914 - 1918 Campagnes de France et d'Orient
Georges Anselmi - 1914 - 1918 Campagnes de France et d'OrientMichel Bruley
 
Poc banking industry - Churn
Poc banking industry - ChurnPoc banking industry - Churn
Poc banking industry - ChurnMichel Bruley
 
Big Data POC in communication industry
Big Data POC in communication industryBig Data POC in communication industry
Big Data POC in communication industryMichel Bruley
 
Photos de famille 1895 1966
Photos de famille 1895   1966Photos de famille 1895   1966
Photos de famille 1895 1966Michel Bruley
 
Compilation d'autres textes de famille
Compilation d'autres textes de familleCompilation d'autres textes de famille
Compilation d'autres textes de familleMichel Bruley
 
Textes de famille concernant les guerres (1814 - 1944)
Textes de famille concernant les guerres (1814 - 1944)Textes de famille concernant les guerres (1814 - 1944)
Textes de famille concernant les guerres (1814 - 1944)Michel Bruley
 
Recette de la dinde au whisky
Recette de la dinde au whiskyRecette de la dinde au whisky
Recette de la dinde au whiskyMichel Bruley
 
Les 2 guerres de René Puig
Les 2 guerres de René PuigLes 2 guerres de René Puig
Les 2 guerres de René PuigMichel Bruley
 
Une societe se_presente
Une societe se_presenteUne societe se_presente
Une societe se_presenteMichel Bruley
 
Dossiers noirs va 4191
Dossiers noirs va 4191Dossiers noirs va 4191
Dossiers noirs va 4191Michel Bruley
 
Irfm mini guide de mauvaise conduite
Irfm mini guide de mauvaise  conduiteIrfm mini guide de mauvaise  conduite
Irfm mini guide de mauvaise conduiteMichel Bruley
 
Estissac et thuisy 2017
Estissac et thuisy   2017Estissac et thuisy   2017
Estissac et thuisy 2017Michel Bruley
 

Más de Michel Bruley (20)

Religion : Dieu y es-tu ? (les articles)
Religion : Dieu y es-tu ? (les articles)Religion : Dieu y es-tu ? (les articles)
Religion : Dieu y es-tu ? (les articles)
 
Réflexion sur les religions : Dieu y es-tu ?
Réflexion sur les religions : Dieu y es-tu ?Réflexion sur les religions : Dieu y es-tu ?
Réflexion sur les religions : Dieu y es-tu ?
 
La chute de l'Empire romain comme modèle.pdf
La chute de l'Empire romain comme modèle.pdfLa chute de l'Empire romain comme modèle.pdf
La chute de l'Empire romain comme modèle.pdf
 
Synthèse sur Neuville.pdf
Synthèse sur Neuville.pdfSynthèse sur Neuville.pdf
Synthèse sur Neuville.pdf
 
Propos sur des sujets qui m'ont titillé.pdf
Propos sur des sujets qui m'ont titillé.pdfPropos sur des sujets qui m'ont titillé.pdf
Propos sur des sujets qui m'ont titillé.pdf
 
Propos sur les Big Data.pdf
Propos sur les Big Data.pdfPropos sur les Big Data.pdf
Propos sur les Big Data.pdf
 
Sun tzu
Sun tzuSun tzu
Sun tzu
 
Georges Anselmi - 1914 - 1918 Campagnes de France et d'Orient
Georges Anselmi - 1914 - 1918 Campagnes de France et d'OrientGeorges Anselmi - 1914 - 1918 Campagnes de France et d'Orient
Georges Anselmi - 1914 - 1918 Campagnes de France et d'Orient
 
Poc banking industry - Churn
Poc banking industry - ChurnPoc banking industry - Churn
Poc banking industry - Churn
 
Big Data POC in communication industry
Big Data POC in communication industryBig Data POC in communication industry
Big Data POC in communication industry
 
Photos de famille 1895 1966
Photos de famille 1895   1966Photos de famille 1895   1966
Photos de famille 1895 1966
 
Compilation d'autres textes de famille
Compilation d'autres textes de familleCompilation d'autres textes de famille
Compilation d'autres textes de famille
 
J'aime BRULEY
J'aime BRULEYJ'aime BRULEY
J'aime BRULEY
 
Textes de famille concernant les guerres (1814 - 1944)
Textes de famille concernant les guerres (1814 - 1944)Textes de famille concernant les guerres (1814 - 1944)
Textes de famille concernant les guerres (1814 - 1944)
 
Recette de la dinde au whisky
Recette de la dinde au whiskyRecette de la dinde au whisky
Recette de la dinde au whisky
 
Les 2 guerres de René Puig
Les 2 guerres de René PuigLes 2 guerres de René Puig
Les 2 guerres de René Puig
 
Une societe se_presente
Une societe se_presenteUne societe se_presente
Une societe se_presente
 
Dossiers noirs va 4191
Dossiers noirs va 4191Dossiers noirs va 4191
Dossiers noirs va 4191
 
Irfm mini guide de mauvaise conduite
Irfm mini guide de mauvaise  conduiteIrfm mini guide de mauvaise  conduite
Irfm mini guide de mauvaise conduite
 
Estissac et thuisy 2017
Estissac et thuisy   2017Estissac et thuisy   2017
Estissac et thuisy 2017
 

Último

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...Aggregage
 
Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...Roland Driesen
 
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...rajveerescorts2022
 
HONOR Veterans Event Keynote by Michael Hawkins
HONOR Veterans Event Keynote by Michael HawkinsHONOR Veterans Event Keynote by Michael Hawkins
HONOR Veterans Event Keynote by Michael HawkinsMichael W. Hawkins
 
Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023Neil Kimberley
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.Aaiza Hassan
 
Cracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptxCracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptxWorkforce Group
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesDipal Arora
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdfRenandantas16
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMANIlamathiKannappan
 
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLMONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLSeo
 
7.pdf This presentation captures many uses and the significance of the number...
7.pdf This presentation captures many uses and the significance of the number...7.pdf This presentation captures many uses and the significance of the number...
7.pdf This presentation captures many uses and the significance of the number...Paul Menig
 
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...amitlee9823
 
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...lizamodels9
 
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...Dave Litwiller
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...anilsa9823
 
Call Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine ServiceCall Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine Serviceritikaroy0888
 
Grateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfGrateful 7 speech thanking everyone that has helped.pdf
Grateful 7 speech thanking everyone that has helped.pdfPaul Menig
 

Último (20)

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
 
Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...
 
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
👉Chandigarh Call Girls 👉9878799926👉Just Call👉Chandigarh Call Girl In Chandiga...
 
HONOR Veterans Event Keynote by Michael Hawkins
HONOR Veterans Event Keynote by Michael HawkinsHONOR Veterans Event Keynote by Michael Hawkins
HONOR Veterans Event Keynote by Michael Hawkins
 
Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023Mondelez State of Snacking and Future Trends 2023
Mondelez State of Snacking and Future Trends 2023
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
Cracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptxCracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptx
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMAN
 
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLMONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
 
7.pdf This presentation captures many uses and the significance of the number...
7.pdf This presentation captures many uses and the significance of the number...7.pdf This presentation captures many uses and the significance of the number...
7.pdf This presentation captures many uses and the significance of the number...
 

Map Reduce

  • 1. MapReduce michel.bruley@teradata.com Extract from various presentations: Sudarshan, Chungnam, Teradata Aster, … April 2012 www.decideo.fr/bruley
  • 2. What is MapReduce?
    Restricted parallel programming model meant for large clusters
    – User implements Map() and Reduce() functions
    Parallel computing framework
    – Libraries take care of EVERYTHING else
      • Parallelization
      • Fault Tolerance
      • Data Distribution
      • Load Balancing
    Useful model for many practical tasks
  • 3. Map and Reduce
    The idea of Map and Reduce is 40+ years old
    – Present in all functional programming languages
    – See, e.g., APL, Lisp and ML
    Alternate name for Map: Apply-All
    Higher-order functions
    – take function definitions as arguments, or
    – return a function as output
    Map and Reduce are higher-order functions.
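To make the higher-order-function idea on this slide concrete, here is a minimal Python sketch. The helper names `apply_all` and `fold` are illustrative only (they are not from the slides); they wrap Python's built-in `map` and `functools.reduce`, each of which takes a function as an argument.

```python
from functools import reduce

def apply_all(fn, items):
    """Map: apply fn to every item and return the results as a list."""
    return list(map(fn, items))

def fold(fn, items, initial):
    """Reduce: combine the items pairwise with fn, starting from initial."""
    return reduce(fn, items, initial)

# Both helpers receive a function (here a lambda) as their first argument.
squares = apply_all(lambda x: x * x, [1, 2, 3, 4])
total = fold(lambda acc, x: acc + x, squares, 0)
```

Here `squares` holds the squared inputs and `total` is their sum; neither helper knows anything about squaring or adding, which is what makes them higher-order.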
  • 4. Map and Reduce Functions
    Functions borrowed from functional programming languages (e.g. Lisp)
    Map()
    – Processes a key/value pair to generate intermediate key/value pairs
    Reduce()
    – Merges all intermediate values associated with the same key
  • 5. Example: Counting Words
    Map()
    – Input: <filename, file text>
    – Parses the file and emits <word, count> pairs
      • e.g. <"hello", 1>
    Reduce()
    – Sums all values for the same key and emits <word, TotalCount>
      • e.g. <"hello", (3 5 2 7)> => <"hello", 17>
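The word-count example can be sketched in a few lines of Python. This is a single-process illustration under simplifying assumptions (whitespace tokenization, lower-casing, everything in memory), not how a real framework runs it; the function names `map_words`, `reduce_counts` and `word_count` are hypothetical.

```python
from collections import defaultdict

def map_words(filename, text):
    """Map: emit a (word, 1) pair for every word in the file text."""
    return [(word.lower(), 1) for word in text.split()]

def reduce_counts(word, counts):
    """Reduce: sum all counts emitted for one word."""
    return (word, sum(counts))

def word_count(files):
    """Group the mapped pairs by word, then reduce each group."""
    grouped = defaultdict(list)
    for filename, text in files.items():
        for word, count in map_words(filename, text):
            grouped[word].append(count)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())
```

Calling `reduce_counts("hello", [3, 5, 2, 7])` reproduces the slide's example and yields `("hello", 17)`.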
  • 6. Execution on Clusters
    1. Input files are split (M splits)
    2. Master and workers are assigned
    3. Map tasks run
    4. Intermediate data is written to disk (R regions)
    5. Intermediate data is read and sorted
    6. Reduce tasks run
    7. Results are returned
  • 7. Map/Reduce Cluster Implementation
    [Diagram: input files (split 0 to split 4) → M map tasks → intermediate files → R reduce tasks → output files (Output 0, Output 1)]
    – Several map or reduce tasks can run on a single computer
    – Each intermediate file is divided into R partitions by the partitioning function
    – Each reduce task corresponds to one partition
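The execution steps and the partitioning described in slides 6 and 7 can be simulated in one process. The sketch below is only an illustration under stated assumptions: `run_job` is a hypothetical name, the splits are formed round-robin, keys are strings, and the partitioning function is assumed to be a CRC32 hash of the key modulo R (the slides do not say which function is used). Assigning the master and workers (step 2) is omitted because everything runs locally.

```python
import zlib
from itertools import groupby
from operator import itemgetter

def run_job(records, map_fn, reduce_fn, m=3, r=2):
    """Simulate a MapReduce job with M map splits and R reduce partitions."""
    # Step 1: split the input into M splits (round-robin, for simplicity).
    splits = [records[i::m] for i in range(m)]

    # Steps 3-4: each map task sends its output to one of R regions.
    # crc32 is used because Python's built-in hash() of a string
    # changes between processes, and partitioning must be stable.
    regions = [[] for _ in range(r)]
    for split in splits:
        for key, value in split:
            for out_key, out_value in map_fn(key, value):
                index = zlib.crc32(out_key.encode("utf-8")) % r
                regions[index].append((out_key, out_value))

    # Steps 5-6: one reduce task per region reads it, sorts by key,
    # and reduces each group of values that share a key.
    output = []
    for region in regions:
        region.sort(key=itemgetter(0))
        for out_key, group in groupby(region, key=itemgetter(0)):
            output.append(reduce_fn(out_key, [value for _, value in group]))

    # Step 7: return the combined output of all reduce tasks.
    return output
```

Because the partition index depends only on the key, every value for a given key lands in the same region, so exactly one reduce task sees all of them.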
  • 8. Map Reduce vs. Parallel Databases
    Map Reduce is widely used for parallel processing
    – Google, Yahoo, and 100's of other companies
    – Example uses: computing PageRank, building keyword indices, analyzing web click logs, …
    Database people say:
    – but parallel databases have been doing this for decades
    Map Reduce people say:
    – we operate at scales of 1000's of machines
    – we handle failures seamlessly
    – we allow procedural code in map and reduce, and data of any type
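One common way to handle task failures, and an assumption here since the slides do not describe the mechanism, is to re-execute a failed task on the same input split, which works because map and reduce tasks are deterministic. The sketch below simulates that with hypothetical names (`run_with_retries`, `flaky_count`); a real framework would also reschedule the task on a different worker.

```python
def run_with_retries(task, split, max_attempts=3):
    """Re-run a deterministic task on the same split until it succeeds."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(split, attempt)
        except RuntimeError as err:
            # A real master would reschedule the task on another worker here.
            last_error = err
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error

def flaky_count(split, attempt):
    """Hypothetical map task that simulates a worker crash on its first attempt."""
    if attempt == 1:
        raise RuntimeError("simulated worker crash")
    return len(split)
```

The caller sees only the final result, which is the sense in which the failure is handled transparently.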
  • 10. Map Reduce Implementations
    Google
    – Not available outside Google
    Hadoop
    – An open-source implementation in Java
    – Uses HDFS for stable storage
    – Download: http://lucene.apache.org/hadoop/
    Teradata Aster
    – Cluster-optimized SQL database that also implements MapReduce
      • IITB alumnus among founders
    And several others, such as Cassandra at Facebook, etc.
  • 11. MapReduce v. Hadoop
                            MapReduce   Hadoop
    Org                     Google      Yahoo/Apache
    Impl                    C++         Java
    Distributed File Sys    GFS         HDFS
    Data Base               Bigtable    HBase
    Distributed lock mgr    Chubby      ZooKeeper
  • 12. Solutions Stack for Teradata Aster
    [Diagram of the solution stack. Labels, in extraction order: Data Integration / ETL; Business Intelligence Tools; Query Tools; Analytics Specialists; Systems Management; Aster Data Ecosystem; Security; Aster Data nCluster; Operating System; Servers; Cloud Infrastructure; Storage; Aster Data Platform; Infrastructure]
  • 13. Teradata Aster Platform Infrastructure
    For physical infrastructure (non-cloud) deployments
    – Aster Data Analytic Platform (nCluster): Aster Data nCluster packaged software
    – Operating System: certified Linux operating system
    – Server Hardware: certified commodity (x86) server hardware with internal storage
  • 14. Teradata Aster Infrastructure
    For cloud deployments
    – Aster Data Analytic Platform (nCluster): Aster Data nCluster packaged software
    – Operating System: Linux operating system
    – Compute Instance (CC, xLarge): compute instance from cloud provider (e.g. Amazon Web Services EC2)
    – Storage (EBS, Ephemeral): storage connected to cloud computing capacity
  • 15. Teradata Aster Architecture for Analytics
    Your Analytics & Advanced Reporting Applications
    – Support for in-database processing of custom applications written in a broad variety of languages
    – Integration with third-party packaged software via ODBC/JDBC or in-database integration
    Aster Data nCluster Analytic Functions and Frameworks
    – Rich libraries of MapReduce analytics from Aster Data and partners
    – Visual development environment: develop in hours
    Unified Interface (SQL, SQL-MapReduce)
    – Standard SQL interface
    – MapReduce processing integrated with SQL via SQL-MapReduce interface
    Analytics Processing Engines (SQL, MapReduce)
    – Optimized SQL engine
    – Fully-integrated in-database MapReduce
    Massively Parallel Data Stores
    – Hybrid row/column DBMS
    – Linear, incremental scalability
    – Commodity hardware
  • 16. Teradata Aster Ecosystem
    Partner / Product / Product release / Platform for Certification
    – MicroStrategy Intelligence Server 9.2.1 32-bit Windows 7, Enterprise Edition SP1, 32-bit, 64-bit
    – SAP Business Objects XI 3.1 Windows 2008, 32-bit
    – Informatica Powercenter 9.0.1 Client: Windows 2003/2008 Server 32 bit. Server: Windows 2003/2008 Server 32 bit and 64 bit
    – IBM Cognos 10.1FP1 n/a
    – Tableau Tableau Server 6 Windows (SS: TBU)
    – Microsoft SSLS, SSAS, SSFS, SSIS SQL Server 2008 .NET Framework 2.0 Windows Server, 2008 64-bit Windows 2003, 32-bit
    * Oracle BIEE certification currently in process