Lec2 Mapred

Distributed Computing Seminar
Lecture 2: MapReduce Theory and Implementation
Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet
Summer 2007
Except as otherwise noted, the contents of this presentation are © Copyright 2007 University of Washington and licensed under the Creative Commons Attribution 2.5 License.

Outline

Functional Programming Review

Functional Programming Review

Functional Updates Do Not Modify Structures
The append() function reverses a list, adds the new element to the front, and then reverses the result, which has the effect of appending the item. It never modifies lst!
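The slide's own append() code did not survive in this transcript. The following is a minimal Python sketch of the technique the text describes (reverse, add to the front, reverse again), assuming immutable tuples stand in for the lecture's lists. The helper names `cons` and `reverse` are hypothetical and chosen only for illustration.

```python
def cons(x, lst):
    """Return a new tuple with x at the front; lst itself is left untouched."""
    return (x,) + lst


def reverse(lst):
    """Return a reversed copy of lst."""
    return lst[::-1]


def append(lst, x):
    """Append x by reversing, adding to the front, and reversing again."""
    return reverse(cons(x, reverse(lst)))


lst = (1, 2, 3)
new_lst = append(lst, 4)
print(lst)      # the original tuple is unchanged
print(new_lst)  # a new tuple with 4 at the end
```

Because every step builds a new value, any other code holding a reference to `lst` still sees the original contents.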

Functions Can Be Used As Arguments
It does not matter what f does to its argument; DoDouble() will do it twice.
What is the type of this function?
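The DoDouble() definition itself is missing from this transcript. A sketch in Python of what the text describes, assuming DoDouble takes a function f and a value x and applies f twice, might look like the following. The type hints give one reading of the slide's question: f must map a value of some type T back to T, so that its output can be fed to it again.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def do_double(f: Callable[[T], T], x: T) -> T:
    """Apply f to x, then apply f again to the result."""
    return f(f(x))


print(do_double(lambda n: n + 1, 5))        # increments twice
print(do_double(lambda s: s + "!", "hi"))   # appends "!" twice
```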

Map

Fold

fold left vs. fold right
SML Implementation:
fun foldl f a [] = a
  | foldl f a (x::xs) = foldl f (f(x, a)) xs
fun foldr f a [] = a
  | foldr f a (x::xs) = f(x, (foldr f a xs))
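To make the difference between the two SML definitions concrete, here is a small Python sketch that mirrors them with loops instead of recursion. The `cons` helper is a hypothetical stand-in for the `::` operator; it is used because it is not commutative, so the traversal order is visible in the result.

```python
def foldl(f, a, xs):
    """Fold from the left: f sees the first element first."""
    for x in xs:
        a = f(x, a)
    return a


def foldr(f, a, xs):
    """Fold from the right: f sees the last element first."""
    for x in reversed(xs):
        a = f(x, a)
    return a


def cons(x, acc):
    """Put x at the front of the accumulated list (like x :: acc)."""
    return [x] + acc


print(foldl(cons, [], [1, 2, 3]))  # builds the list in reverse order
print(foldr(cons, [], [1, 2, 3]))  # rebuilds the list in its original order
```

With an associative and commutative operator such as addition, both folds return the same value, which is why the order only matters for operators like `cons`.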

Example

Example (Solved)

A More Complicated Fold Problem

A More Complicated Map Problem

map Implementation
fun map f [] = []
  | map f (x::xs) = (f x) :: (map f xs)
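For readers less familiar with SML pattern matching, the same two clauses can be sketched in Python as follows. The name `my_map` is hypothetical and is used only to avoid shadowing Python's built-in `map`.

```python
def my_map(f, xs):
    """Recursive map mirroring the two SML clauses."""
    if not xs:                        # map f [] = []
        return []
    x, rest = xs[0], xs[1:]           # the (x::xs) pattern
    return [f(x)] + my_map(f, rest)   # (f x) :: (map f xs)


print(my_map(lambda n: n * n, [1, 2, 3]))
```

Each call to `f` depends only on its own element, never on the result for another element.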

Implicit Parallelism In map

Motivation: Large Scale Data Processing

MapReduce

Programming Model

map

reduce

Parallelism

Example: Count word occurrences
map(String input_key, String input_value):
  // input_key: document name
  // input_value: document contents
  for each word w in input_value:
    EmitIntermediate(w, "1");

reduce(String output_key, Iterator intermediate_values):
  // output_key: a word
  // intermediate_values: a list of counts
  int result = 0;
  for each v in intermediate_values:
    result += ParseInt(v);
  Emit(AsString(result));
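The pseudocode above can be exercised on a single machine with a small Python sketch. The `run_mapreduce` driver is a hypothetical stand-in for the framework: it calls the map function on every document, groups the intermediate pairs by key, and then calls the reduce function once per key. It does none of the distribution, scheduling, or fault handling that a real MapReduce system provides.

```python
from collections import defaultdict


def map_fn(input_key, input_value):
    """input_key: document name; input_value: document contents."""
    for word in input_value.split():
        yield (word, "1")


def reduce_fn(output_key, intermediate_values):
    """output_key: a word; intermediate_values: iterable of count strings."""
    result = 0
    for v in intermediate_values:
        result += int(v)
    return str(result)


def run_mapreduce(documents):
    """Hypothetical single-process driver: map, group by key, reduce."""
    groups = defaultdict(list)
    for name, contents in documents.items():
        for key, value in map_fn(name, contents):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


docs = {"a.txt": "the quick fox", "b.txt": "the lazy dog the end"}
counts = run_mapreduce(docs)
print(counts)
```

Counts are kept as strings only to stay close to the pseudocode, which emits "1" and parses it back in reduce.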

Example vs. Actual Source Code

Locality

Fault Tolerance

Optimizations
Why is it safe to redundantly execute map tasks? Wouldn't this mess up the total computation?
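The slide's answer is not preserved in this transcript. One way to see why redundant execution can be safe is sketched below in Python, under the assumption that map tasks are deterministic and free of side effects: every copy then produces identical output, so keeping one result and discarding the others cannot change the computation. The `run_with_backup` scheduler is hypothetical and runs the copies sequentially rather than on separate machines.

```python
def map_task(split):
    """A pure map task: its output depends only on its input split."""
    return [(word, 1) for word in split.split()]


def run_with_backup(split, copies=2):
    """Hypothetical scheduler: run redundant copies and keep one result."""
    results = [map_task(split) for _ in range(copies)]
    # Every copy produced the same output, so discarding the extras
    # cannot alter the final computation.
    assert all(r == results[0] for r in results)
    return results[0]


print(run_with_backup("to be or not to be"))
```

If a map function wrote to an external database or depended on the current time, the copies could disagree and this argument would no longer hold.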

Optimizations
Under what conditions is it sound to use a combiner?
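A combiner applies the reduce logic to partial groups of values on the map side, before the final reduce. That is generally sound when the reduce operation is associative and commutative. The Python sketch below compares results with and without a combiner for a sum, which meets the condition, and for a mean, which does not. The function names and sample data are hypothetical.

```python
def with_combiner(partitions, fn):
    """Apply fn to each map-side partition, then again to the partial results."""
    return fn([fn(p) for p in partitions])


def without_combiner(partitions, fn):
    """Apply fn once to all values, as a plain reduce would."""
    return fn([v for p in partitions for v in p])


def mean(values):
    return sum(values) / len(values)


partitions = [[1, 2, 3], [10]]

# Sum: combining early gives the same answer as reducing everything at once.
print(with_combiner(partitions, sum), without_combiner(partitions, sum))

# Mean: a mean of per-partition means differs from the overall mean.
print(with_combiner(partitions, mean), without_combiner(partitions, mean))
```

A mean can still be computed with a combiner if the intermediate values carry both a running sum and a count, since merging those pairs is associative and commutative.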

MapReduce Conclusions
