Se ha denunciado esta presentación.
Utilizamos tu perfil de LinkedIn y tus datos de actividad para personalizar los anuncios y mostrarte publicidad más relevante. Puedes cambiar tus preferencias de publicidad en cualquier momento.

“WordCount”, a “Hello World” for Getting Started on Hadoop

20.458 visualizaciones

Publicado el

“WordCount”, a “Hello World” for MapReduce

Definition: count how often each word appears within a collection
of text documents.

A simple program which illustrates a pretty good test case for what
MapReduce can perform, since it incorporates:

• minimal amount code
• document feature extraction (where words are “terms”)
• symbolic and numeric values
• potential use of a combiner
• bipartite graph of (doc, term) tuples
• not so many steps away from useful indexing…
When a framework can run “WordCount” in parallel at scale, then it
can handle much larger, more interesting compute problems as well.

Publicado en: Tecnología