SlideShare a Scribd company logo
1 of 15
Download to read offline
Mi Primer Map/Reduce

Rubén Orta @agileando
1

historia

2

implementación

3

netflix prize en python

4

enlaces
1

Big Data = Contar
CON
TAR

1
1

Jeff
Dean

Sanjay
Ghemawat
map (key , value)
new_value = a_function(value)
return new_key, new_value

2

reduce (key, value)
new_value = another_function(value)
return key, new_value
Dataset:

2

Millones de páginas web

Map
f()

f()

f()

f()

f()

f’()

f’()

f’()

f’()

f’()

for each word in document:
return (word, 1);

Reduce
total = 0
for each item in value:
total++
return (key, total);
2
3
import mincemeat

3

data = dict((f, read_data(f)) for f in data_files)
s = mincemeat.Server()
s.datasource = data
s.mapfn = mapfn
s.reducefn = reducefn
results = s.run_server (password = "ruben")
def mapfn(key, value):
lines = value.splitlines()
film_id = lines[0][:-1]
for line in lines[1:]:
items = line.split(",")
user_id = items[0]
rating = items[1]
date = items[2]
yield user_id, film_id

3
def reducefn(key, values):
number_of_films = 0
for value in values:
number_of_films += 1
return number_of_films

3
Papers

4

GFS
MapReduce
BigTable

http://research.google.com/archive/gfs.html
http://research.google.com/archive/mapreduce.html
http://research.google.com/archive/bigtable.html

Dynamo

http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf

Dremel
Spanner

http://research.google.com/pubs/pub36632.html
http://research.google.com/archive/spanner.html

Python
MinceMeat.py https://github.com/michaelfairley/mincemeatpy
Octo.py
http://code.google.com/p/octopy/
Netflix DataSet http://www.lifecrunch.biz/archives/207
Rubén Orta
http://www.slideshare.net/agileando/mi-primer-map-reduce

Blog
Twitter
GitHub

http://devspoke.com/
https://twitter.com/agileando
https://github.com/rubenorta

4
BUSCAMOS GENTE
PARA NUESTRO
EQUIPO
¿Quieres unirte?
*unix, scripting (python, perl)
devops

More Related Content

What's hot

Implementation of k-means clustering algorithm in C
Implementation of k-means clustering algorithm in CImplementation of k-means clustering algorithm in C
Implementation of k-means clustering algorithm in C
Kasun Ranga Wijeweera
 
Geolocation on Rails
Geolocation on RailsGeolocation on Rails
Geolocation on Rails
nebirhos
 
peRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysispeRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysis
Vyacheslav Arbuzov
 
Mapreduce: Theory and implementation
Mapreduce: Theory and implementationMapreduce: Theory and implementation
Mapreduce: Theory and implementation
Sri Prasanna
 
Map reduce (from Google)
Map reduce (from Google)Map reduce (from Google)
Map reduce (from Google)
Sri Prasanna
 

What's hot (20)

XIX PUG-PE - Pygame game development
XIX PUG-PE - Pygame game developmentXIX PUG-PE - Pygame game development
XIX PUG-PE - Pygame game development
 
Java Week6(B) Notepad
Java Week6(B)   NotepadJava Week6(B)   Notepad
Java Week6(B) Notepad
 
Java Week10 Notepad
Java Week10   NotepadJava Week10   Notepad
Java Week10 Notepad
 
MapReduce
MapReduceMapReduce
MapReduce
 
Implementation of k-means clustering algorithm in C
Implementation of k-means clustering algorithm in CImplementation of k-means clustering algorithm in C
Implementation of k-means clustering algorithm in C
 
Map reduce functional programming
Map reduce   functional programmingMap reduce   functional programming
Map reduce functional programming
 
Composite functions
Composite functionsComposite functions
Composite functions
 
PyParis - weather and climate data
PyParis - weather and climate dataPyParis - weather and climate data
PyParis - weather and climate data
 
Scaling up data science applications
Scaling up data science applicationsScaling up data science applications
Scaling up data science applications
 
Data Types and Processing in ES6
Data Types and Processing in ES6Data Types and Processing in ES6
Data Types and Processing in ES6
 
Geolocation on Rails
Geolocation on RailsGeolocation on Rails
Geolocation on Rails
 
peRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysispeRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysis
 
Mapreduce: Theory and implementation
Mapreduce: Theory and implementationMapreduce: Theory and implementation
Mapreduce: Theory and implementation
 
Functional programming
Functional programming Functional programming
Functional programming
 
Elixir 5 minute intro
Elixir 5 minute introElixir 5 minute intro
Elixir 5 minute intro
 
Mysql count
Mysql countMysql count
Mysql count
 
Geolocation Databases in Ruby on Rails
Geolocation Databases in Ruby on RailsGeolocation Databases in Ruby on Rails
Geolocation Databases in Ruby on Rails
 
Map reduce (from Google)
Map reduce (from Google)Map reduce (from Google)
Map reduce (from Google)
 
Plotting position and velocity
Plotting position and velocityPlotting position and velocity
Plotting position and velocity
 
Admission for b.tech
Admission for b.techAdmission for b.tech
Admission for b.tech
 

Viewers also liked

Viewers also liked (7)

Camilo Sarasti - Director LATAM / App Tripda
Camilo Sarasti - Director LATAM / App TripdaCamilo Sarasti - Director LATAM / App Tripda
Camilo Sarasti - Director LATAM / App Tripda
 
Boltio: desarrollo exprés de una app para Android
Boltio: desarrollo exprés de una app para AndroidBoltio: desarrollo exprés de una app para Android
Boltio: desarrollo exprés de una app para Android
 
Qué es tripda
Qué es tripdaQué es tripda
Qué es tripda
 
Yelmo cines app
Yelmo cines appYelmo cines app
Yelmo cines app
 
Gudog
GudogGudog
Gudog
 
Emprendiendo con mi app
Emprendiendo con mi appEmprendiendo con mi app
Emprendiendo con mi app
 
La Nevera Roja desarrollo de un app nativa
La Nevera Roja desarrollo de un app nativaLa Nevera Roja desarrollo de un app nativa
La Nevera Roja desarrollo de un app nativa
 

Similar to Mi primer map reduce

Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and ImplementationDistributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
tugrulh
 
Functional programming in Python
Functional programming in PythonFunctional programming in Python
Functional programming in Python
Colin Su
 
Laziness, trampolines, monoids and other functional amenities: this is not yo...
Laziness, trampolines, monoids and other functional amenities: this is not yo...Laziness, trampolines, monoids and other functional amenities: this is not yo...
Laziness, trampolines, monoids and other functional amenities: this is not yo...
Mario Fusco
 
Company_X_Data_Analyst_Challenge
Company_X_Data_Analyst_ChallengeCompany_X_Data_Analyst_Challenge
Company_X_Data_Analyst_Challenge
Mark Yashar
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2go
Moriyoshi Koizumi
 

Similar to Mi primer map reduce (20)

How to use Map() Filter() and Reduce() functions in Python | Edureka
How to use Map() Filter() and Reduce() functions in Python | EdurekaHow to use Map() Filter() and Reduce() functions in Python | Edureka
How to use Map() Filter() and Reduce() functions in Python | Edureka
 
Lec2 Mapred
Lec2 MapredLec2 Mapred
Lec2 Mapred
 
Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and ImplementationDistributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
Distributed Computing Seminar - Lecture 2: MapReduce Theory and Implementation
 
Functional programming in Python
Functional programming in PythonFunctional programming in Python
Functional programming in Python
 
Scala kansai summit-2016
Scala kansai summit-2016Scala kansai summit-2016
Scala kansai summit-2016
 
Laziness, trampolines, monoids and other functional amenities: this is not yo...
Laziness, trampolines, monoids and other functional amenities: this is not yo...Laziness, trampolines, monoids and other functional amenities: this is not yo...
Laziness, trampolines, monoids and other functional amenities: this is not yo...
 
関数プログラミングことはじめ revival
関数プログラミングことはじめ revival関数プログラミングことはじめ revival
関数プログラミングことはじめ revival
 
Forward & Backward Differenece Table
Forward & Backward Differenece TableForward & Backward Differenece Table
Forward & Backward Differenece Table
 
Rのスコープとフレームと環境と
Rのスコープとフレームと環境とRのスコープとフレームと環境と
Rのスコープとフレームと環境と
 
Company_X_Data_Analyst_Challenge
Company_X_Data_Analyst_ChallengeCompany_X_Data_Analyst_Challenge
Company_X_Data_Analyst_Challenge
 
Functional perl
Functional perlFunctional perl
Functional perl
 
Data Visualization with R.ggplot2 and its extensions examples.
Data Visualization with R.ggplot2 and its extensions examples.Data Visualization with R.ggplot2 and its extensions examples.
Data Visualization with R.ggplot2 and its extensions examples.
 
Functional JS for everyone - 4Developers
Functional JS for everyone - 4DevelopersFunctional JS for everyone - 4Developers
Functional JS for everyone - 4Developers
 
ggtimeseries-->ggplot2 extensions
ggtimeseries-->ggplot2 extensions ggtimeseries-->ggplot2 extensions
ggtimeseries-->ggplot2 extensions
 
Javascript
JavascriptJavascript
Javascript
 
purrr.pdf
purrr.pdfpurrr.pdf
purrr.pdf
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2go
 
Function : Introduction
Function : IntroductionFunction : Introduction
Function : Introduction
 
Ch5b.ppt
Ch5b.pptCh5b.ppt
Ch5b.ppt
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 

More from betabeers

More from betabeers (20)

IONIC, el framework para crear aplicaciones híbridas multiplataforma
IONIC, el framework para crear aplicaciones híbridas multiplataformaIONIC, el framework para crear aplicaciones híbridas multiplataforma
IONIC, el framework para crear aplicaciones híbridas multiplataforma
 
Servicios de Gestión de Datos en la Nube - Jaime Balañá (NetApp)
Servicios de Gestión de Datos en la Nube - Jaime Balañá (NetApp)Servicios de Gestión de Datos en la Nube - Jaime Balañá (NetApp)
Servicios de Gestión de Datos en la Nube - Jaime Balañá (NetApp)
 
Blockchain: la revolución industrial de internet - Oscar Lage
Blockchain: la revolución industrial de internet - Oscar LageBlockchain: la revolución industrial de internet - Oscar Lage
Blockchain: la revolución industrial de internet - Oscar Lage
 
Cloud Learning: la formación del siglo XXI - Mónica Mediavilla
Cloud Learning: la formación del siglo XXI - Mónica MediavillaCloud Learning: la formación del siglo XXI - Mónica Mediavilla
Cloud Learning: la formación del siglo XXI - Mónica Mediavilla
 
Desarrollo web en Nodejs con Pillars por Chelo Quilón
Desarrollo web en Nodejs con Pillars por Chelo QuilónDesarrollo web en Nodejs con Pillars por Chelo Quilón
Desarrollo web en Nodejs con Pillars por Chelo Quilón
 
La línea recta hacia el éxito - Jon Torrado - Betabeers Bilbao
La línea recta hacia el éxito -  Jon Torrado - Betabeers BilbaoLa línea recta hacia el éxito -  Jon Torrado - Betabeers Bilbao
La línea recta hacia el éxito - Jon Torrado - Betabeers Bilbao
 
6 errores a evitar si eres una startup móvil y quieres evolucionar tu app
6 errores a evitar si eres una startup móvil y quieres evolucionar tu app6 errores a evitar si eres una startup móvil y quieres evolucionar tu app
6 errores a evitar si eres una startup móvil y quieres evolucionar tu app
 
Dev ops.continuous delivery - Ibon Landa (Plain Concepts)
Dev ops.continuous delivery - Ibon Landa (Plain Concepts)Dev ops.continuous delivery - Ibon Landa (Plain Concepts)
Dev ops.continuous delivery - Ibon Landa (Plain Concepts)
 
Introducción a scrum - Rodrigo Corral (Plain Concepts)
Introducción a scrum - Rodrigo Corral (Plain Concepts)Introducción a scrum - Rodrigo Corral (Plain Concepts)
Introducción a scrum - Rodrigo Corral (Plain Concepts)
 
Gestión de proyectos y consorcios internacionales - Iñigo Cañadas (GFI)
Gestión de proyectos y consorcios internacionales - Iñigo Cañadas (GFI)Gestión de proyectos y consorcios internacionales - Iñigo Cañadas (GFI)
Gestión de proyectos y consorcios internacionales - Iñigo Cañadas (GFI)
 
Software de gestión Open Source - Odoo - Bakartxo Aristegi (Aizean)
Software de gestión Open Source - Odoo - Bakartxo Aristegi (Aizean)Software de gestión Open Source - Odoo - Bakartxo Aristegi (Aizean)
Software de gestión Open Source - Odoo - Bakartxo Aristegi (Aizean)
 
Elemental, querido Watson - Caso de Uso
Elemental, querido Watson - Caso de UsoElemental, querido Watson - Caso de Uso
Elemental, querido Watson - Caso de Uso
 
Seguridad en tu startup
Seguridad en tu startupSeguridad en tu startup
Seguridad en tu startup
 
Spark Java: Aplicaciones web ligeras y rápidas con Java, por Fran Paredes.
Spark Java: Aplicaciones web ligeras y rápidas con Java, por Fran Paredes.Spark Java: Aplicaciones web ligeras y rápidas con Java, por Fran Paredes.
Spark Java: Aplicaciones web ligeras y rápidas con Java, por Fran Paredes.
 
Buenas prácticas para la optimización web
Buenas prácticas para la optimización webBuenas prácticas para la optimización web
Buenas prácticas para la optimización web
 
La magia de Scrum
La magia de ScrumLa magia de Scrum
La magia de Scrum
 
Programador++ por @wottam
Programador++ por @wottamProgramador++ por @wottam
Programador++ por @wottam
 
RaspberryPi: Tu dispositivo para IoT
RaspberryPi: Tu dispositivo para IoTRaspberryPi: Tu dispositivo para IoT
RaspberryPi: Tu dispositivo para IoT
 
Introducción al Big Data - Xabier Tranche - VIII Betabeers Bilbao 27/02/2015
 Introducción al Big Data - Xabier Tranche  - VIII Betabeers Bilbao 27/02/2015 Introducción al Big Data - Xabier Tranche  - VIII Betabeers Bilbao 27/02/2015
Introducción al Big Data - Xabier Tranche - VIII Betabeers Bilbao 27/02/2015
 
PAYTPV Plataforma Integral de Cobros - VIII Betabeers Bilbao 27/02/2015
PAYTPV Plataforma Integral de Cobros - VIII Betabeers Bilbao 27/02/2015PAYTPV Plataforma Integral de Cobros - VIII Betabeers Bilbao 27/02/2015
PAYTPV Plataforma Integral de Cobros - VIII Betabeers Bilbao 27/02/2015
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 

Mi primer map reduce