Hack reduce introduction


  1. What is hack/reduce?
     • A Home for the Big Data Community
     • 24/7 Access to Cluster Compute Power
     • Regular Hackathons
  2. hack/reduce
     2011 Montreal Toronto Boston Ottawa 2012
     hack/reduce: Boston’s Big Data Hackspace
  3. Why should you care?
     • Work with Millions and Billions of records
     • Find patterns in Big Data sets
     • Use data to detect, predict, forecast
     • Extract new information from raw data
  4. APIs Suck
     In Big data there are:
     • no requests
     • no predefined parameters
     • no structured responses
     You are free to intersect anything with anything. You can analyse, mutate, group, split, reorder in any way you can imagine.
  5. What you can do today
     • Access the hack/reduce GoGrid Cluster:
       • 240 Cores
       • 240GB of RAM
       • 10TB of Disk
  6. What you can do today
     Use Hadoop to Explore big Open Data sets, like:
     • 20 Years of the Federal Parliament Hansard
     • Hourly Canadian Weather 1953 to 2001
     • The 1881 Census. Details about 4.3M people
     • One Summer of Bixi Station Status Updates
  7. What is Map/Reduce?
     • Framework for distributed computing on large data sets on clusters of computers
     • MapReduce patented by Google
     • Hadoop implementation is Googlesque
     • Michael Stonebraker hates it
  8. What is Map/Reduce?
     • Map = function applied in parallel to every item in the dataset
     • Reduce = function applied in parallel to groups of values emitted by Map function
  9. What is Map/Reduce?
       map(String docId, String document):
         for each word w in document:
           emit(w, 1);
       reduce(String word, Iterator counts):
         int sum = 0;
         for each count in counts:
           sum += count;
         emit(word, sum);
  10. private key (“hackreduce”): http://bit.ly/X13pNh
      wiki: http://github.com/hackreduce/Hackathon
      SSH: ssh -i hackreduce hackreduce@cluster-
      MapReduce: http://cluster-1-master.gg.hackreduce
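
The word-count pseudocode on slide 9 can be tried on a laptop before touching the cluster. What follows is a minimal single-process sketch in Python, not Hadoop code: the function names (map_words, shuffle, reduce_counts, word_count) and the sample documents are made up for illustration, and the explicit shuffle step stands in for the grouping that a framework such as Hadoop would do between Map and Reduce.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(doc_id: str, document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key (done by the framework on a real cluster)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: sum all counts emitted for one word."""
    return word, sum(counts)


def word_count(documents: Dict[str, str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over a dict of documents."""
    pairs = (
        pair
        for doc_id, text in documents.items()
        for pair in map_words(doc_id, text)
    )
    grouped = shuffle(pairs)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not one of the hack/reduce data sets.
    docs = {"doc1": "big data big clusters", "doc2": "big open data"}
    print(word_count(docs))
    # prints {'big': 3, 'data': 2, 'clusters': 1, 'open': 1}
```

Here everything runs sequentially in one process. On a cluster the map calls would run in parallel across documents and the reduce calls in parallel across words, which is the point of slide 8.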

Editor's notes

  • We are Hopper. Hopper is using Big Data to solve travel planning.
  • Hopper’s Montreal office was home to the inaugural Hack/Reduce event two years ago.
  • Hack/Reduce is a community. We held 4 events, in Montreal, Toronto, Boston and Ottawa. More than 300 hackers participated. Now we’re building a permanent Hack/Reduce community hackspace in Boston.
  • GoGrid is sponsoring the cluster
  • If you’re interested in learning something different, come talk to us.
