SlideShare una empresa de Scribd logo
1 de 20
Descargar para leer sin conexión
The Elephant In the Room: MongoDB + Hadoop

                         Brendan W. McAdams

                                    10gen, Inc.


              Mongo Chicago Conference - October 2010




                                                               Mongo Chicago Conference - October 2010
B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
Outline




1   Introduction
       Why Integrate?
       MongoDB + MapReduce in Java
       Using Pig for High Level Data Analysis




                                                                   Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
Hadoop Explained. . .

    Started in February 2006 as part of the Apache Lucene project
    Based upon Google’s MapReduce and GFS Papers
    Allows distributed, scalable data processing of huge datasets
    Java based, but with support for other JVM and Non-JVM
    Languages
    Lots of ecosystem tools to simplify interaction such as Pig and
    Hive
    In use at New York Times, Last.fm, Yahoo!, Amazon, Facebook
    and many more companies. . .
    Great tools for temporary Hadoop clusters such as the Cloudera
    Cluster Tools, Apache Whirr and Amazon’s Elastic MapReduce.



                                                                  Mongo Chicago Conference - October 2010
   B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
Why Integrate MongoDB?



   Easiest Answer: People keep asking for it . . .
   Limitations in MongoDB’s built in MapReduce
         Language: JavaScript only; not everyone wants to write JavaScript
          for data processing.
         Concurrency: Current JS Implementation is limited to one JS
          execution per server at a time.
         Scalability: Not a lot of ability to scale MongoDB’s MapReduce
          except in cases of sharding.




                                                                 Mongo Chicago Conference - October 2010
  B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
Why Integrate MongoDB?



   Easiest Answer: People keep asking for it . . .
   Limitations in MongoDB’s built in MapReduce
         Language: JavaScript only; not everyone wants to write JavaScript
          for data processing.
         Concurrency: Current JS Implementation is limited to one JS
          execution per server at a time.
         Scalability: Not a lot of ability to scale MongoDB’s MapReduce
          except in cases of sharding.




                                                                 Mongo Chicago Conference - October 2010
  B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
Why Integrate MongoDB?



   Easiest Answer: People keep asking for it . . .
   Limitations in MongoDB’s built in MapReduce
         Language: JavaScript only; not everyone wants to write JavaScript
          for data processing.
         Concurrency: Current JS Implementation is limited to one JS
          execution per server at a time.
         Scalability: Not a lot of ability to scale MongoDB’s MapReduce
          except in cases of sharding.




                                                                 Mongo Chicago Conference - October 2010
  B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
Why Integrate MongoDB?



   Easiest Answer: People keep asking for it . . .
   Limitations in MongoDB’s built in MapReduce
         Language: JavaScript only; not everyone wants to write JavaScript
          for data processing.
         Concurrency: Current JS Implementation is limited to one JS
          execution per server at a time.
         Scalability: Not a lot of ability to scale MongoDB’s MapReduce
          except in cases of sharding.




                                                                 Mongo Chicago Conference - October 2010
  B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
Mongo + Hadoop . . .


    Integrating MongoDB and Hadoop to read  write data from/to
    MongoDB via Hadoop
    10gen has been working on a plugin to integrate the two systems,
    written in Pure Java
    About 6 months ago I explored the idea in Scala with a project
    called ‘Luau’
    Support for pure MapReduce as well as Pig (Currently output only
    - input coming soon)
    With Hadoop Streaming (soon), write your MapReduce in Python
    or Ruby




                                                                  Mongo Chicago Conference - October 2010
   B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17
MongoDB + MapReduce in Java I
// WordCount.java

import java.io.*;
import java.util.*;

import org.bson.*;
import com.mongodb.*;
import com.mongodb.hadoop.*;

import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;

/**
 * test.in
 db.in.insert( { x : eliot was here } )
 db.in.insert( { x : eliot is here } )
 db.in.insert( { x : who is here } )
  =
 */
public class WordCount {


   public static class TokenizerMapper extends MapperObject, BSONObject, Text,
     IntWritable{

      private final static IntWritable one = new IntWritable(1);
      private Text word = new Text();

      public void map(Object key, BSONObject value, Context context )
          throws IOException, InterruptedException {

                                                                       Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)       The Elephant In the Room: MongoDB + Hadoop                                  / 17
MongoDB + MapReduce in Java II
              System.out.println( key:  + key );
              System.out.println( value:  + value );

              StringTokenizer itr = new StringTokenizer(value.get( x ).toString());
              while (itr.hasMoreTokens()) {
                  word.set(itr.nextToken());
                  context.write(word, one);
              }
      }
  }


  public static class IntSumReducer         extends ReducerText,IntWritable,Text,IntWritable {

          private IntWritable result = new IntWritable();

          public void reduce(Text key, IterableIntWritable values,           Context context )
              throws IOException, InterruptedException {

               int sum = 0;
               for (IntWritable val : values) {
                   sum += val.get();
               }
               result.set(sum);
               context.write(key, result);
          }
  }

  public static void main(String[] args)
      throws Exception {

                                                                         Mongo Chicago Conference - October 2010
  B.W. McAdams (10gen)           The Elephant In the Room: MongoDB + Hadoop                                  / 17
MongoDB + MapReduce in Java III


        Configuration conf = new Configuration();
        conf.set( MONGO_INPUT , mongodb://localhost/test.in );
        conf.set( MONGO_OUTPUT , mongodb://localhost/test.out );

        Job job = new Job(conf, word count);

        job.setJarByClass(WordCount.class);

        job.setMapperClass(TokenizerMapper.class);

        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setInputFormatClass( MongoInputFormat.class );
        job.setOutputFormatClass( MongoOutputFormat.class );

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}




                                                                     Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)     The Elephant In the Room: MongoDB + Hadoop                                  / 17
The Results I

 db.out.find()
{ _id : talking, value : 1 }
{ _id : tender,, value : 1 }
{ _id : tender., value : 1 }
{ _id : them--, value : 1 }
{ _id : there’s, value : 3 }
{ _id : there,, value : 1 }
{ _id : there., value : 1 }
{ _id : thing,, value : 1 }
{ _id : thing?, value : 1 }
{ _id : things, value : 3 }
{ _id : things., value : 3 }
{ _id : think, value : 5 }
{ _id : thirsty,, value : 1 }
{ _id : this, value : 7 }
{ _id : this., value : 1 }
{ _id : those, value : 1 }
{ _id : thought, value : 1 }
{ _id : threaded, value : 1 }
{ _id : throw, value : 1 }
{ _id : tidy., value : 2 }
has more




                                                                     Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)     The Elephant In the Room: MongoDB + Hadoop                                  / 17
Sample Input Data I
2A9EABFB35F5B954     970916105432    +md foods +proteins
BED75271605EBD0C     970916001949    yahoo chat
BED75271605EBD0C     970916001954    yahoo chat
BED75271605EBD0C     970916003523    yahoo chat
BED75271605EBD0C     970916011322    yahoo search
BED75271605EBD0C     970916011404    yahoo chat
BED75271605EBD0C     970916011422    yahoo chat
BED75271605EBD0C     970916012756    yahoo caht
BED75271605EBD0C     970916012816    yahoo chat
BED75271605EBD0C     970916023603    yahoo chat
BED75271605EBD0C     970916025458    yahoo caht
BED75271605EBD0C     970916025516    yahoo chat
BED75271605EBD0C     970916030348    yahoo chat
BED75271605EBD0C     970916034807    yahoo chat
BED75271605EBD0C     970916040755    yahoo chat
BED75271605EBD0C     970916090700    hawaii chat universe
BED75271605EBD0C     970916094445    yahoo chat
BED75271605EBD0C     970916191427    yahoo chat
BED75271605EBD0C     970916201045    yahoo chat
BED75271605EBD0C     970916201050    yahoo chat
BED75271605EBD0C     970916201927    yahoo chat
824F413FA37520BF     970916184809    garter belts
824F413FA37520BF     970916184818    garter belts
824F413FA37520BF     970916184939    lingerie
824F413FA37520BF     970916185051    spiderman
824F413FA37520BF     970916185155    tommy hilfiger
824F413FA37520BF     970916185257    calgary
824F413FA37520BF     970916185513    calgary
824F413FA37520BF     970916185605    exhibitionists
824F413FA37520BF     970916190220    exhibitionists
824F413FA37520BF     970916191233    exhibitionists

                                                                      Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)      The Elephant In the Room: MongoDB + Hadoop                                  / 17
Sample Input Data II


7A8D9CFC957C7FCA     970916064707    duron paint
7A8D9CFC957C7FCA     970916064731    duron paint
A25C8C765238184A     970916103534    brookings
A25C8C765238184A     970916104751    breton liberation front
A25C8C765238184A     970916105238    breton
A25C8C765238184A     970916105322    breton liberation front
A25C8C765238184A     970916105539    breton
A25C8C765238184A     970916105628    breton
A25C8C765238184A     970916105723    front de liberation de la bretagne
A25C8C765238184A     970916105857    front de liberation de la bretagne
6589F4342B215FD4     970916125147    afghanistan
6589F4342B215FD4     970916125158    afghanistan
6589F4342B215FD4     970916125407    afghanistan
16DE160B4FFE3B85     970916050356    eia rs232-c
16DE160B4FFE3B85     970916050645    nullmodem
16DE160B4FFE3B85     970916050807    nullmodem
563FC9A7E8A9022A     970916042826    organizational chart
563FC9A7E8A9022A     970916221456    organizational chart of uae’s companies
563FC9A7E8A9022A     970916221611    organizational chart of dubai dutiy free




                                                                      Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)      The Elephant In the Room: MongoDB + Hadoop                                  / 17
The Pig Script I
-- Query Phrase Popularity (local mode)

-- This script processes a search query log file from the Excite search engine and finds
      search phrases that occur with particular high frequency during certain times of the
      day.

-- Register the tutorial JAR file so that the included UDFs can be called in the script.

-- Based   on the Pig tutorial ,modified for Mongo support tests
REGISTER   src/examples/pigtutorial.jar;
REGISTER   mongo-hadoop.jar;
REGISTER   lib/mongo-java-driver.jar;

-- Use the PigStorage function to load the excite log file into the raw bag as an array of
      records.
-- Input: (user,time,query)
raw = LOAD ’excite-small.log’ USING PigStorage(’t’) AS (user, time, query);

-- Call the NonURLDetector UDF to remove records if the query field is empty or a URL.
clean1 = FILTER raw BY org.apache.pig.tutorial.NonURLDetector(query);

-- Call the ToLower UDF to change the query field to lowercase.
clean2 = FOREACH clean1 GENERATE user, time, org.apache.pig.tutorial.ToLower(query) as query;

-- Because the log file    only contains queries for a single day, we are only interested in
      the hour.
-- The excite query log    timestamp format is YYMMDDHHMMSS.
-- Call the ExtractHour    UDF to extract the hour (HH) from the time field.
houred = FOREACH clean2    GENERATE user, org.apache.pig.tutorial.ExtractHour(time) as hour,
      query;


                                                                        Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)        The Elephant In the Room: MongoDB + Hadoop                                  / 17
The Pig Script II
-- Call the NGramGenerator UDF to compose the n-grams of the query.
ngramed1 = FOREACH houred GENERATE user, hour,
      flatten(org.apache.pig.tutorial.NGramGenerator(query)) as ngram;

-- Use the DISTINCT command to get the unique n-grams for all records.
ngramed2 = DISTINCT ngramed1;

-- Use the GROUP command to group records by n-gram and hour.
hour_frequency1 = GROUP ngramed2 BY (ngram, hour);

-- Use the COUNT function to get the count (occurrences) of each n-gram.
hour_frequency2 = FOREACH hour_frequency1 GENERATE flatten($0), COUNT($1) as count;

-- Use the GROUP command to group records by n-gram only.
-- Each group now corresponds to a distinct n-gram and has the count for each hour.
uniq_frequency1 = GROUP hour_frequency2 BY group::ngram;

-- For each group, identify the hour in which this n-gram is used with a particularly high
      frequency.
-- Call the ScoreGenerator UDF to calculate a popularity score for the n-gram.
uniq_frequency2 = FOREACH uniq_frequency1 GENERATE flatten($0),
      flatten(org.apache.pig.tutorial.ScoreGenerator($1));

-- Use the FOREACH-GENERATE command to assign names to the fields.
uniq_frequency3 = FOREACH uniq_frequency2 GENERATE $1 as hour, $0 as ngram, $2 as score, $3
      as count, $4 as mean;

-- Use the FILTER command to move all records with a score less than or equal to 2.0.
filtered_uniq_frequency = FILTER uniq_frequency3 BY score  2.0;

-- Use the ORDER command to sort the remaining records by hour and score.

                                                                      Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)      The Elephant In the Room: MongoDB + Hadoop                                  / 17
The Pig Script III




ordered_uniq_frequency = ORDER filtered_uniq_frequency BY hour, score;

-- Use the PigStorage function to store the results.
-- Output: (hour, n-gram, score, count, average_counts_among_all_hours)
STORE ordered_uniq_frequency INTO ’mongodb://localhost/test.pig.output’ USING
      com.mongodb.hadoop.pig.MongoStorage;




                                                                     Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)     The Elephant In the Room: MongoDB + Hadoop                                  / 17
The Output I
 db.pig.output.find()
{ _id : ObjectId(4cbbbd9487e836b7e0dc1052), hour : 07, ngram : new, score :
      2.4494897427831788, count : NumberLong(2), mean : 1.1428571428571426 }
{ _id : ObjectId(4cbbbd9487e836b7e1dc1052), hour : 08, ngram : pictures, score
      : 2.04939015319192, count : NumberLong(3), mean : 1.4999999999999998 }
{ _id : ObjectId(4cbbbd9487e836b7e2dc1052), hour : 08, ngram : computer, score
      : 2.4494897427831788, count : NumberLong(2), mean : 1.1428571428571426 }
{ _id : ObjectId(4cbbbd9487e836b7e3dc1052), hour : 08, ngram : s, score :
      2.545584412271571, count : NumberLong(3), mean : 1.3636363636363635 }
{ _id : ObjectId(4cbbbd9487e836b7e4dc1052), hour : 10, ngram : free, score :
      2.2657896674010605, count : NumberLong(4), mean : 1.923076923076923 }
{ _id : ObjectId(4cbbbd9487e836b7e5dc1052), hour : 10, ngram : to, score :
      2.6457513110645903, count : NumberLong(2), mean : 1.125 }
{ _id : ObjectId(4cbbbd9487e836b7e6dc1052), hour : 10, ngram : pics, score :
      2.794002794004192, count : NumberLong(3), mean : 1.3076923076923075 }
{ _id : ObjectId(4cbbbd9487e836b7e7dc1052), hour : 10, ngram : school, score :
      2.828427124746189, count : NumberLong(2), mean : 1.1111111111111114 }
{ _id : ObjectId(4cbbbd9487e836b7e8dc1052), hour : 11, ngram : pictures, score
      : 2.04939015319192, count : NumberLong(3), mean : 1.4999999999999998 }
{ _id : ObjectId(4cbbbd9487e836b7e9dc1052), hour : 11, ngram : in, score :
      2.1572774865200244, count : NumberLong(3), mean : 1.4285714285714284 }
{ _id : ObjectId(4cbbbd9487e836b7eadc1052), hour : 13, ngram : the, score :
      3.1309398305840723, count : NumberLong(6), mean : 1.9375 }
{ _id : ObjectId(4cbbbd9487e836b7ebdc1052), hour : 14, ngram : music, score :
      2.1105794120443453, count : NumberLong(4), mean : 1.6666666666666667 }
{ _id : ObjectId(4cbbbd9487e836b7ecdc1052), hour : 14, ngram : city, score :
      2.2360679774997902, count : NumberLong(2), mean : 1.1666666666666665 }
{ _id : ObjectId(4cbbbd9487e836b7eddc1052), hour : 14, ngram : university,
      score : 2.412090756622109, count : NumberLong(3), mean : 1.4000000000000001 }
{ _id : ObjectId(4cbbbd9487e836b7eedc1052), hour : 15, ngram : adult, score :
      2.8284271247461903, count : NumberLong(2), mean : 1.1111111111111112 }

                                                                     Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)     The Elephant In the Room: MongoDB + Hadoop                                  / 17
The Output II



{ _id : ObjectId(4cbbbd9487e836b7efdc1052), hour : 17, ngram : chat, score :
      2.9104275004359965, count : NumberLong(3), mean : 1.2857142857142854 }
{ _id : ObjectId(4cbbbd9487e836b7f0dc1052), hour : 19, ngram : in, score :
      2.1572774865200244, count : NumberLong(3), mean : 1.4285714285714284 }
{ _id : ObjectId(4cbbbd9487e836b7f1dc1052), hour : 19, ngram : car, score :
      2.23606797749979, count : NumberLong(3), mean : 1.3333333333333333 }
{ _id : ObjectId(4cbbcf8194b936b7073ae8cb), hour : 07, ngram : new, score :
      2.4494897427831788, count : NumberLong(2), mean : 1.1428571428571426 }
{ _id : ObjectId(4cbbcf8194b936b7083ae8cb), hour : 08, ngram : pictures, score
      : 2.04939015319192, count : NumberLong(3), mean : 1.4999999999999998 }
has more




                                                                     Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)     The Elephant In the Room: MongoDB + Hadoop                                  / 17
Questions?
Contact Info




      Contact Me
           twitter: @rit
           email: brendan@10gen.com
           mongo-hadoop . . . Available Soon in Alpha Form:
            http://github.com/mongodb/mongo-hadoop
      Pressing Questions?
           IRC - freenode.net #mongodb
           MongoDB Users List -
            http://groups.google.com/group/mongodb-user




                                                                   Mongo Chicago Conference - October 2010
    B.W. McAdams (10gen)   The Elephant In the Room: MongoDB + Hadoop                                  / 17

Más contenido relacionado

La actualidad más candente

Morphia, Spring Data & Co.
Morphia, Spring Data & Co.Morphia, Spring Data & Co.
Morphia, Spring Data & Co.Tobias Trelle
 
Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014MongoDB
 
MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know Norberto Leite
 
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...MongoDB
 
Webinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to BasicsWebinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to BasicsMongoDB
 
Social Analytics with MongoDB
Social Analytics with MongoDBSocial Analytics with MongoDB
Social Analytics with MongoDBPatrick Stokes
 
Java Development with MongoDB
Java Development with MongoDBJava Development with MongoDB
Java Development with MongoDBScott Hernandez
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...MongoDB
 
Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...
Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...
Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...Artem Romanchik
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
 
MongoDB: Intro & Application for Big Data
MongoDB: Intro & Application  for Big DataMongoDB: Intro & Application  for Big Data
MongoDB: Intro & Application for Big DataTakahiro Inoue
 
Omnibus database machine
Omnibus database machineOmnibus database machine
Omnibus database machineAleck Landgraf
 
Working with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDBWorking with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDBScaleGrid.io
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesMongoDB
 
Real World Application Performance with MongoDB
Real World Application Performance with MongoDBReal World Application Performance with MongoDB
Real World Application Performance with MongoDBMongoDB
 

La actualidad más candente (20)

Android GRPC
Android GRPCAndroid GRPC
Android GRPC
 
Morphia, Spring Data & Co.
Morphia, Spring Data & Co.Morphia, Spring Data & Co.
Morphia, Spring Data & Co.
 
Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014
 
MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know
 
A Brief MongoDB Intro
A Brief MongoDB IntroA Brief MongoDB Intro
A Brief MongoDB Intro
 
Mining legal texts with Python
Mining legal texts with PythonMining legal texts with Python
Mining legal texts with Python
 
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...
MongoDB.local DC 2018: Ch-Ch-Ch-Ch-Changes: Taking Your MongoDB Stitch Applic...
 
Webinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to BasicsWebinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to Basics
 
MongoDB - Ekino PHP
MongoDB - Ekino PHPMongoDB - Ekino PHP
MongoDB - Ekino PHP
 
Social Analytics with MongoDB
Social Analytics with MongoDBSocial Analytics with MongoDB
Social Analytics with MongoDB
 
Java Development with MongoDB
Java Development with MongoDBJava Development with MongoDB
Java Development with MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
 
Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...
Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...
Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in...
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
 
MongoDB: Intro & Application for Big Data
MongoDB: Intro & Application  for Big DataMongoDB: Intro & Application  for Big Data
MongoDB: Intro & Application for Big Data
 
Omnibus database machine
Omnibus database machineOmnibus database machine
Omnibus database machine
 
Working with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDBWorking with JSON Data in PostgreSQL vs. MongoDB
Working with JSON Data in PostgreSQL vs. MongoDB
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
Real World Application Performance with MongoDB
Real World Application Performance with MongoDBReal World Application Performance with MongoDB
Real World Application Performance with MongoDB
 

Destacado

The elephant in the room. discussion
The elephant in the room. discussionThe elephant in the room. discussion
The elephant in the room. discussionAndrew Gelston
 
asteRISK
asteRISKasteRISK
asteRISKkrnmcg
 
Elephant in the Room: Social Media ROI - WEB 2.0 NYC
Elephant in the Room: Social Media ROI - WEB 2.0 NYCElephant in the Room: Social Media ROI - WEB 2.0 NYC
Elephant in the Room: Social Media ROI - WEB 2.0 NYCMike Lewis
 
Moving the Elephant in the Room: Data Migration at Scale
Moving the Elephant in the Room: Data Migration at ScaleMoving the Elephant in the Room: Data Migration at Scale
Moving the Elephant in the Room: Data Migration at ScaleTyrone Hinderson
 
Risk: the Elephant in the Room
Risk: the Elephant in the RoomRisk: the Elephant in the Room
Risk: the Elephant in the RoomLast Call Media
 
Strategy - The elephant in the room
Strategy - The elephant in the roomStrategy - The elephant in the room
Strategy - The elephant in the roomIIBA UK Chapter
 
Addressing the Elephant in the Room - Content Strategy
Addressing the Elephant in the Room - Content StrategyAddressing the Elephant in the Room - Content Strategy
Addressing the Elephant in the Room - Content StrategyRay Killebrew
 
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterElephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterRobert H. McDonald
 
CMS Expo 2011 Keynote - The Elephant in the Room
CMS Expo 2011 Keynote - The Elephant in the RoomCMS Expo 2011 Keynote - The Elephant in the Room
CMS Expo 2011 Keynote - The Elephant in the RoomScott Liewehr
 
Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...
Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...
Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...ad:tech London, MMS & iMedia
 
The elephant in the room
The elephant in the roomThe elephant in the room
The elephant in the roomJohn Gillis
 
The Elephant In The Room - Research Report 31 July 2013
The Elephant In The Room - Research Report 31 July 2013The Elephant In The Room - Research Report 31 July 2013
The Elephant In The Room - Research Report 31 July 2013Howard Cooke
 
RIDE 2011: Student dropout – the elephant in the room of distance education (...
RIDE 2011: Student dropout – the elephant in the room of distance education (...RIDE 2011: Student dropout – the elephant in the room of distance education (...
RIDE 2011: Student dropout – the elephant in the room of distance education (...Centre for Distance Education
 
Kanban. Dealing with the elephant in the room. One chunk at a time
Kanban. Dealing with the elephant in the room. One chunk at a timeKanban. Dealing with the elephant in the room. One chunk at a time
Kanban. Dealing with the elephant in the room. One chunk at a timejsonnevelt
 
How to Tame the Elephant in the Room- 6 steps to build trust and close deals!
How to Tame the Elephant in the Room- 6 steps to build trust and close deals!How to Tame the Elephant in the Room- 6 steps to build trust and close deals!
How to Tame the Elephant in the Room- 6 steps to build trust and close deals!Mitch Jackson
 

Destacado (20)

The elephant in the room. discussion
The elephant in the room. discussionThe elephant in the room. discussion
The elephant in the room. discussion
 
asteRISK
asteRISKasteRISK
asteRISK
 
Elephant in the Room: Social Media ROI - WEB 2.0 NYC
Elephant in the Room: Social Media ROI - WEB 2.0 NYCElephant in the Room: Social Media ROI - WEB 2.0 NYC
Elephant in the Room: Social Media ROI - WEB 2.0 NYC
 
Elephant in Room Version 2
Elephant in Room Version 2Elephant in Room Version 2
Elephant in Room Version 2
 
Moving the Elephant in the Room: Data Migration at Scale
Moving the Elephant in the Room: Data Migration at ScaleMoving the Elephant in the Room: Data Migration at Scale
Moving the Elephant in the Room: Data Migration at Scale
 
YUI The Elephant In The Room
YUI The Elephant In The RoomYUI The Elephant In The Room
YUI The Elephant In The Room
 
Risk: the Elephant in the Room
Risk: the Elephant in the RoomRisk: the Elephant in the Room
Risk: the Elephant in the Room
 
Strategy - The elephant in the room
Strategy - The elephant in the roomStrategy - The elephant in the room
Strategy - The elephant in the room
 
ELEARNING IN ART AND DESIGN: THE ELEPHANT IN THE ROOM
ELEARNING IN ART AND DESIGN: THE ELEPHANT IN THE ROOMELEARNING IN ART AND DESIGN: THE ELEPHANT IN THE ROOM
ELEARNING IN ART AND DESIGN: THE ELEPHANT IN THE ROOM
 
The elephant in the room
The elephant in the roomThe elephant in the room
The elephant in the room
 
Addressing the Elephant in the Room - Content Strategy
Addressing the Elephant in the Room - Content StrategyAddressing the Elephant in the Room - Content Strategy
Addressing the Elephant in the Room - Content Strategy
 
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterElephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
 
CMS Expo 2011 Keynote - The Elephant in the Room
CMS Expo 2011 Keynote - The Elephant in the RoomCMS Expo 2011 Keynote - The Elephant in the Room
CMS Expo 2011 Keynote - The Elephant in the Room
 
Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...
Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...
Lance Concannon, Sysomos: Simplifiying social - How marketers can manage the ...
 
G!
G!G!
G!
 
The elephant in the room
The elephant in the roomThe elephant in the room
The elephant in the room
 
The Elephant In The Room - Research Report 31 July 2013
The Elephant In The Room - Research Report 31 July 2013The Elephant In The Room - Research Report 31 July 2013
The Elephant In The Room - Research Report 31 July 2013
 
RIDE 2011: Student dropout – the elephant in the room of distance education (...
RIDE 2011: Student dropout – the elephant in the room of distance education (...RIDE 2011: Student dropout – the elephant in the room of distance education (...
RIDE 2011: Student dropout – the elephant in the room of distance education (...
 
Kanban. Dealing with the elephant in the room. One chunk at a time
Kanban. Dealing with the elephant in the room. One chunk at a timeKanban. Dealing with the elephant in the room. One chunk at a time
Kanban. Dealing with the elephant in the room. One chunk at a time
 
How to Tame the Elephant in the Room- 6 steps to build trust and close deals!
How to Tame the Elephant in the Room- 6 steps to build trust and close deals!How to Tame the Elephant in the Room- 6 steps to build trust and close deals!
How to Tame the Elephant in the Room- 6 steps to build trust and close deals!
 

Similar a The elephant in the room mongo db + hadoop

Spring Data MongoDB 介紹
Spring Data MongoDB 介紹Spring Data MongoDB 介紹
Spring Data MongoDB 介紹Kuo-Chun Su
 
Introduction to using MongoDB with Ruby
Introduction to using MongoDB with RubyIntroduction to using MongoDB with Ruby
Introduction to using MongoDB with RubyJonathan Holloway
 
Mango Database - Web Development
Mango Database - Web DevelopmentMango Database - Web Development
Mango Database - Web Developmentmssaman
 
Analytical data processing
Analytical data processingAnalytical data processing
Analytical data processingPolad Saruxanov
 
Mongo db bangalore
Mongo db bangaloreMongo db bangalore
Mongo db bangaloreMongoDB
 
Node Js, AngularJs and Express Js Tutorial
Node Js, AngularJs and Express Js TutorialNode Js, AngularJs and Express Js Tutorial
Node Js, AngularJs and Express Js TutorialPHP Support
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentationHyphen Call
 
MongoDB & Hadoop, Sittin' in a Tree
MongoDB & Hadoop, Sittin' in a TreeMongoDB & Hadoop, Sittin' in a Tree
MongoDB & Hadoop, Sittin' in a TreeMongoDB
 
Spatial mongo for PHP and Zend
Spatial mongo for PHP and ZendSpatial mongo for PHP and Zend
Spatial mongo for PHP and ZendSteven Pousty
 
Using MongoDB For BigData in 20 Minutes
Using MongoDB For BigData in 20 MinutesUsing MongoDB For BigData in 20 Minutes
Using MongoDB For BigData in 20 MinutesAndrás Fehér
 
Webinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsWebinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsMongoDB
 
Webinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsWebinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsMongoDB
 
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDBMongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDBMongoDB
 
MongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & ExecutionMongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & ExecutioniFour Technolab Pvt. Ltd.
 
Rapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and PythonRapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and PythonRick Copeland
 
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...IndicThreads
 
Bringing spatial love to your python application
Bringing spatial love to your python applicationBringing spatial love to your python application
Bringing spatial love to your python applicationShekhar Gulati
 

Similar a The elephant in the room mongo db + hadoop (20)

Spring Data MongoDB 介紹
Spring Data MongoDB 介紹Spring Data MongoDB 介紹
Spring Data MongoDB 介紹
 
Introduction to using MongoDB with Ruby
Introduction to using MongoDB with RubyIntroduction to using MongoDB with Ruby
Introduction to using MongoDB with Ruby
 
Mango Database - Web Development
Mango Database - Web DevelopmentMango Database - Web Development
Mango Database - Web Development
 
Analytical data processing
Analytical data processingAnalytical data processing
Analytical data processing
 
Mongo db bangalore
Mongo db bangaloreMongo db bangalore
Mongo db bangalore
 
MongoDB Part 2
MongoDB Part 2MongoDB Part 2
MongoDB Part 2
 
Node Js, AngularJs and Express Js Tutorial
Node Js, AngularJs and Express Js TutorialNode Js, AngularJs and Express Js Tutorial
Node Js, AngularJs and Express Js Tutorial
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentation
 
MongoDB & Hadoop, Sittin' in a Tree
MongoDB & Hadoop, Sittin' in a TreeMongoDB & Hadoop, Sittin' in a Tree
MongoDB & Hadoop, Sittin' in a Tree
 
MongoDB and Node.js
MongoDB and Node.jsMongoDB and Node.js
MongoDB and Node.js
 
Spatial mongo for PHP and Zend
Spatial mongo for PHP and ZendSpatial mongo for PHP and Zend
Spatial mongo for PHP and Zend
 
Using MongoDB For BigData in 20 Minutes
Using MongoDB For BigData in 20 MinutesUsing MongoDB For BigData in 20 Minutes
Using MongoDB For BigData in 20 Minutes
 
Webinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsWebinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.js
 
Webinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsWebinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.js
 
Mongo db report
Mongo db reportMongo db report
Mongo db report
 
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDBMongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB
 
MongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & ExecutionMongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & Execution
 
Rapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and PythonRapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and Python
 
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
 
Bringing spatial love to your python application
Bringing spatial love to your python applicationBringing spatial love to your python application
Bringing spatial love to your python application
 

Más de iammutex

Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagramiammutex
 
Redis深入浅出
Redis深入浅出Redis深入浅出
Redis深入浅出iammutex
 
深入了解Redis
深入了解Redis深入了解Redis
深入了解Redisiammutex
 
NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析iammutex
 
MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用iammutex
 
8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slide8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slideiammutex
 
Thoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency ModelsThoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency Modelsiammutex
 
Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告iammutex
 
redis 适用场景与实现
redis 适用场景与实现redis 适用场景与实现
redis 适用场景与实现iammutex
 
Introduction to couchdb
Introduction to couchdbIntroduction to couchdb
Introduction to couchdbiammutex
 
What every data programmer needs to know about disks
What every data programmer needs to know about disksWhat every data programmer needs to know about disks
What every data programmer needs to know about disksiammutex
 
redis运维之道
redis运维之道redis运维之道
redis运维之道iammutex
 
Realtime hadoopsigmod2011
Realtime hadoopsigmod2011Realtime hadoopsigmod2011
Realtime hadoopsigmod2011iammutex
 
[译]No sql生态系统
[译]No sql生态系统[译]No sql生态系统
[译]No sql生态系统iammutex
 
Couchdb + Membase = Couchbase
Couchdb + Membase = CouchbaseCouchdb + Membase = Couchbase
Couchdb + Membase = Couchbaseiammutex
 
Redis cluster
Redis clusterRedis cluster
Redis clusteriammutex
 
Redis cluster
Redis clusterRedis cluster
Redis clusteriammutex
 

Más de iammutex (20)

Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagram
 
Redis深入浅出
Redis深入浅出Redis深入浅出
Redis深入浅出
 
深入了解Redis
深入了解Redis深入了解Redis
深入了解Redis
 
NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析
 
MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用
 
8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slide8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slide
 
skip list
skip listskip list
skip list
 
Thoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency ModelsThoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency Models
 
Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告
 
redis 适用场景与实现
redis 适用场景与实现redis 适用场景与实现
redis 适用场景与实现
 
Introduction to couchdb
Introduction to couchdbIntroduction to couchdb
Introduction to couchdb
 
What every data programmer needs to know about disks
What every data programmer needs to know about disksWhat every data programmer needs to know about disks
What every data programmer needs to know about disks
 
Ooredis
OoredisOoredis
Ooredis
 
Ooredis
OoredisOoredis
Ooredis
 
redis运维之道
redis运维之道redis运维之道
redis运维之道
 
Realtime hadoopsigmod2011
Realtime hadoopsigmod2011Realtime hadoopsigmod2011
Realtime hadoopsigmod2011
 
[译]No sql生态系统
[译]No sql生态系统[译]No sql生态系统
[译]No sql生态系统
 
Couchdb + Membase = Couchbase
Couchdb + Membase = CouchbaseCouchdb + Membase = Couchbase
Couchdb + Membase = Couchbase
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 

Último

Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 

Último (20)

Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 

The elephant in the room mongo db + hadoop

  • 1. The Elephant In the Room: MongoDB + Hadoop Brendan W. McAdams 10gen, Inc. Mongo Chicago Conference - October 2010 Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 2. Outline 1 Introduction Why Integrate? MongoDB + MapReduce in Java Using Pig for High Level Data Analysis Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 3. Hadoop Explained. . . Started in February 2006 as part of the Apache Lucene project Based upon Google’s MapReduce and GFS Papers Allows distributed, scalable data processing of huge datasets Java based, but with support for other JVM and Non-JVM Languages Lots of ecosystem tools to simplify interaction such as Pig and Hive In use at New York Times, Last.fm, Yahoo!, Amazon, Facebook and many more companies. . . Great tools for temporary Hadoop clusters such as the Cloudera Cluster Tools, Apache Whirr and Amazon’s Elastic MapReduce. Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 4. Why Integrate MongoDB? Easiest Answer: People keep asking for it . . . Limitations in MongoDB’s built in MapReduce Language: JavaScript only; not everyone wants to write JavaScript for data processing. Concurrency: Current JS Implementation is limited to one JS execution per server at a time. Scalability: Not a lot of ability to scale MongoDB’s MapReduce except in cases of sharding. Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 5. Why Integrate MongoDB? Easiest Answer: People keep asking for it . . . Limitations in MongoDB’s built in MapReduce Language: JavaScript only; not everyone wants to write JavaScript for data processing. Concurrency: Current JS Implementation is limited to one JS execution per server at a time. Scalability: Not a lot of ability to scale MongoDB’s MapReduce except in cases of sharding. Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 6. Why Integrate MongoDB? Easiest Answer: People keep asking for it . . . Limitations in MongoDB’s built in MapReduce Language: JavaScript only; not everyone wants to write JavaScript for data processing. Concurrency: Current JS Implementation is limited to one JS execution per server at a time. Scalability: Not a lot of ability to scale MongoDB’s MapReduce except in cases of sharding. Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 7. Why Integrate MongoDB? Easiest Answer: People keep asking for it . . . Limitations in MongoDB’s built in MapReduce Language: JavaScript only; not everyone wants to write JavaScript for data processing. Concurrency: Current JS Implementation is limited to one JS execution per server at a time. Scalability: Not a lot of ability to scale MongoDB’s MapReduce except in cases of sharding. Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 8. Mongo + Hadoop . . . Integrating MongoDB and Hadoop to read write data from/to MongoDB via Hadoop 10gen has been working on a plugin to integrate the two systems, written in Pure Java About 6 months ago I explored the idea in Scala with a project called ‘Luau’ Support for pure MapReduce as well as Pig (Currently output only - input coming soon) With Hadoop Streaming (soon), write your MapReduce in Python or Ruby Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 9. MongoDB + MapReduce in Java I // WordCount.java import java.io.*; import java.util.*; import org.bson.*; import com.mongodb.*; import com.mongodb.hadoop.*; import org.apache.hadoop.conf.*; import org.apache.hadoop.io.*; import org.apache.hadoop.mapreduce.*; /** * test.in db.in.insert( { x : eliot was here } ) db.in.insert( { x : eliot is here } ) db.in.insert( { x : who is here } ) = */ public class WordCount { public static class TokenizerMapper extends MapperObject, BSONObject, Text, IntWritable{ private final static IntWritable one = new IntWritable(1); private Text word = new Text(); public void map(Object key, BSONObject value, Context context ) throws IOException, InterruptedException { Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 10. MongoDB + MapReduce in Java II System.out.println( key: + key ); System.out.println( value: + value ); StringTokenizer itr = new StringTokenizer(value.get( x ).toString()); while (itr.hasMoreTokens()) { word.set(itr.nextToken()); context.write(word, one); } } } public static class IntSumReducer extends ReducerText,IntWritable,Text,IntWritable { private IntWritable result = new IntWritable(); public void reduce(Text key, IterableIntWritable values, Context context ) throws IOException, InterruptedException { int sum = 0; for (IntWritable val : values) { sum += val.get(); } result.set(sum); context.write(key, result); } } public static void main(String[] args) throws Exception { Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 11. MongoDB + MapReduce in Java III Configuration conf = new Configuration(); conf.set( MONGO_INPUT , mongodb://localhost/test.in ); conf.set( MONGO_OUTPUT , mongodb://localhost/test.out ); Job job = new Job(conf, word count); job.setJarByClass(WordCount.class); job.setMapperClass(TokenizerMapper.class); job.setCombinerClass(IntSumReducer.class); job.setReducerClass(IntSumReducer.class); job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); job.setInputFormatClass( MongoInputFormat.class ); job.setOutputFormatClass( MongoOutputFormat.class ); System.exit(job.waitForCompletion(true) ? 0 : 1); } } Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 12. The Results I db.out.find() { _id : talking, value : 1 } { _id : tender,, value : 1 } { _id : tender., value : 1 } { _id : them--, value : 1 } { _id : there’s, value : 3 } { _id : there,, value : 1 } { _id : there., value : 1 } { _id : thing,, value : 1 } { _id : thing?, value : 1 } { _id : things, value : 3 } { _id : things., value : 3 } { _id : think, value : 5 } { _id : thirsty,, value : 1 } { _id : this, value : 7 } { _id : this., value : 1 } { _id : those, value : 1 } { _id : thought, value : 1 } { _id : threaded, value : 1 } { _id : throw, value : 1 } { _id : tidy., value : 2 } has more Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 13. Sample Input Data I 2A9EABFB35F5B954 970916105432 +md foods +proteins BED75271605EBD0C 970916001949 yahoo chat BED75271605EBD0C 970916001954 yahoo chat BED75271605EBD0C 970916003523 yahoo chat BED75271605EBD0C 970916011322 yahoo search BED75271605EBD0C 970916011404 yahoo chat BED75271605EBD0C 970916011422 yahoo chat BED75271605EBD0C 970916012756 yahoo caht BED75271605EBD0C 970916012816 yahoo chat BED75271605EBD0C 970916023603 yahoo chat BED75271605EBD0C 970916025458 yahoo caht BED75271605EBD0C 970916025516 yahoo chat BED75271605EBD0C 970916030348 yahoo chat BED75271605EBD0C 970916034807 yahoo chat BED75271605EBD0C 970916040755 yahoo chat BED75271605EBD0C 970916090700 hawaii chat universe BED75271605EBD0C 970916094445 yahoo chat BED75271605EBD0C 970916191427 yahoo chat BED75271605EBD0C 970916201045 yahoo chat BED75271605EBD0C 970916201050 yahoo chat BED75271605EBD0C 970916201927 yahoo chat 824F413FA37520BF 970916184809 garter belts 824F413FA37520BF 970916184818 garter belts 824F413FA37520BF 970916184939 lingerie 824F413FA37520BF 970916185051 spiderman 824F413FA37520BF 970916185155 tommy hilfiger 824F413FA37520BF 970916185257 calgary 824F413FA37520BF 970916185513 calgary 824F413FA37520BF 970916185605 exhibitionists 824F413FA37520BF 970916190220 exhibitionists 824F413FA37520BF 970916191233 exhibitionists Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 14. Sample Input Data II 7A8D9CFC957C7FCA 970916064707 duron paint 7A8D9CFC957C7FCA 970916064731 duron paint A25C8C765238184A 970916103534 brookings A25C8C765238184A 970916104751 breton liberation front A25C8C765238184A 970916105238 breton A25C8C765238184A 970916105322 breton liberation front A25C8C765238184A 970916105539 breton A25C8C765238184A 970916105628 breton A25C8C765238184A 970916105723 front de liberation de la bretagne A25C8C765238184A 970916105857 front de liberation de la bretagne 6589F4342B215FD4 970916125147 afghanistan 6589F4342B215FD4 970916125158 afghanistan 6589F4342B215FD4 970916125407 afghanistan 16DE160B4FFE3B85 970916050356 eia rs232-c 16DE160B4FFE3B85 970916050645 nullmodem 16DE160B4FFE3B85 970916050807 nullmodem 563FC9A7E8A9022A 970916042826 organizational chart 563FC9A7E8A9022A 970916221456 organizational chart of uae’s companies 563FC9A7E8A9022A 970916221611 organizational chart of dubai dutiy free Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 15. The Pig Script I -- Query Phrase Popularity (local mode) -- This script processes a search query log file from the Excite search engine and finds search phrases that occur with particular high frequency during certain times of the day. -- Register the tutorial JAR file so that the included UDFs can be called in the script. -- Based on the Pig tutorial ,modified for Mongo support tests REGISTER src/examples/pigtutorial.jar; REGISTER mongo-hadoop.jar; REGISTER lib/mongo-java-driver.jar; -- Use the PigStorage function to load the excite log file into the raw bag as an array of records. -- Input: (user,time,query) raw = LOAD ’excite-small.log’ USING PigStorage(’t’) AS (user, time, query); -- Call the NonURLDetector UDF to remove records if the query field is empty or a URL. clean1 = FILTER raw BY org.apache.pig.tutorial.NonURLDetector(query); -- Call the ToLower UDF to change the query field to lowercase. clean2 = FOREACH clean1 GENERATE user, time, org.apache.pig.tutorial.ToLower(query) as query; -- Because the log file only contains queries for a single day, we are only interested in the hour. -- The excite query log timestamp format is YYMMDDHHMMSS. -- Call the ExtractHour UDF to extract the hour (HH) from the time field. houred = FOREACH clean2 GENERATE user, org.apache.pig.tutorial.ExtractHour(time) as hour, query; Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 16. The Pig Script II -- Call the NGramGenerator UDF to compose the n-grams of the query. ngramed1 = FOREACH houred GENERATE user, hour, flatten(org.apache.pig.tutorial.NGramGenerator(query)) as ngram; -- Use the DISTINCT command to get the unique n-grams for all records. ngramed2 = DISTINCT ngramed1; -- Use the GROUP command to group records by n-gram and hour. hour_frequency1 = GROUP ngramed2 BY (ngram, hour); -- Use the COUNT function to get the count (occurrences) of each n-gram. hour_frequency2 = FOREACH hour_frequency1 GENERATE flatten($0), COUNT($1) as count; -- Use the GROUP command to group records by n-gram only. -- Each group now corresponds to a distinct n-gram and has the count for each hour. uniq_frequency1 = GROUP hour_frequency2 BY group::ngram; -- For each group, identify the hour in which this n-gram is used with a particularly high frequency. -- Call the ScoreGenerator UDF to calculate a popularity score for the n-gram. uniq_frequency2 = FOREACH uniq_frequency1 GENERATE flatten($0), flatten(org.apache.pig.tutorial.ScoreGenerator($1)); -- Use the FOREACH-GENERATE command to assign names to the fields. uniq_frequency3 = FOREACH uniq_frequency2 GENERATE $1 as hour, $0 as ngram, $2 as score, $3 as count, $4 as mean; -- Use the FILTER command to move all records with a score less than or equal to 2.0. filtered_uniq_frequency = FILTER uniq_frequency3 BY score 2.0; -- Use the ORDER command to sort the remaining records by hour and score. Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 17. The Pig Script III ordered_uniq_frequency = ORDER filtered_uniq_frequency BY hour, score; -- Use the PigStorage function to store the results. -- Output: (hour, n-gram, score, count, average_counts_among_all_hours) STORE ordered_uniq_frequency INTO ’mongodb://localhost/test.pig.output’ USING com.mongodb.hadoop.pig.MongoStorage; Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 18. The Output I db.pig.output.find() { _id : ObjectId(4cbbbd9487e836b7e0dc1052), hour : 07, ngram : new, score : 2.4494897427831788, count : NumberLong(2), mean : 1.1428571428571426 } { _id : ObjectId(4cbbbd9487e836b7e1dc1052), hour : 08, ngram : pictures, score : 2.04939015319192, count : NumberLong(3), mean : 1.4999999999999998 } { _id : ObjectId(4cbbbd9487e836b7e2dc1052), hour : 08, ngram : computer, score : 2.4494897427831788, count : NumberLong(2), mean : 1.1428571428571426 } { _id : ObjectId(4cbbbd9487e836b7e3dc1052), hour : 08, ngram : s, score : 2.545584412271571, count : NumberLong(3), mean : 1.3636363636363635 } { _id : ObjectId(4cbbbd9487e836b7e4dc1052), hour : 10, ngram : free, score : 2.2657896674010605, count : NumberLong(4), mean : 1.923076923076923 } { _id : ObjectId(4cbbbd9487e836b7e5dc1052), hour : 10, ngram : to, score : 2.6457513110645903, count : NumberLong(2), mean : 1.125 } { _id : ObjectId(4cbbbd9487e836b7e6dc1052), hour : 10, ngram : pics, score : 2.794002794004192, count : NumberLong(3), mean : 1.3076923076923075 } { _id : ObjectId(4cbbbd9487e836b7e7dc1052), hour : 10, ngram : school, score : 2.828427124746189, count : NumberLong(2), mean : 1.1111111111111114 } { _id : ObjectId(4cbbbd9487e836b7e8dc1052), hour : 11, ngram : pictures, score : 2.04939015319192, count : NumberLong(3), mean : 1.4999999999999998 } { _id : ObjectId(4cbbbd9487e836b7e9dc1052), hour : 11, ngram : in, score : 2.1572774865200244, count : NumberLong(3), mean : 1.4285714285714284 } { _id : ObjectId(4cbbbd9487e836b7eadc1052), hour : 13, ngram : the, score : 3.1309398305840723, count : NumberLong(6), mean : 1.9375 } { _id : ObjectId(4cbbbd9487e836b7ebdc1052), hour : 14, ngram : music, score : 2.1105794120443453, count : NumberLong(4), mean : 1.6666666666666667 } { _id : ObjectId(4cbbbd9487e836b7ecdc1052), hour : 14, ngram : city, score : 2.2360679774997902, count : NumberLong(2), mean : 1.1666666666666665 } { _id : ObjectId(4cbbbd9487e836b7eddc1052), hour : 14, ngram : university, score : 2.412090756622109, count : NumberLong(3), mean : 1.4000000000000001 } { _id : ObjectId(4cbbbd9487e836b7eedc1052), hour : 15, ngram : adult, score : 2.8284271247461903, count : NumberLong(2), mean : 1.1111111111111112 } Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 19. The Output II { _id : ObjectId(4cbbbd9487e836b7efdc1052), hour : 17, ngram : chat, score : 2.9104275004359965, count : NumberLong(3), mean : 1.2857142857142854 } { _id : ObjectId(4cbbbd9487e836b7f0dc1052), hour : 19, ngram : in, score : 2.1572774865200244, count : NumberLong(3), mean : 1.4285714285714284 } { _id : ObjectId(4cbbbd9487e836b7f1dc1052), hour : 19, ngram : car, score : 2.23606797749979, count : NumberLong(3), mean : 1.3333333333333333 } { _id : ObjectId(4cbbcf8194b936b7073ae8cb), hour : 07, ngram : new, score : 2.4494897427831788, count : NumberLong(2), mean : 1.1428571428571426 } { _id : ObjectId(4cbbcf8194b936b7083ae8cb), hour : 08, ngram : pictures, score : 2.04939015319192, count : NumberLong(3), mean : 1.4999999999999998 } has more Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17
  • 20. Questions? Contact Info Contact Me twitter: @rit email: brendan@10gen.com mongo-hadoop . . . Available Soon in Alpha Form: http://github.com/mongodb/mongo-hadoop Pressing Questions? IRC - freenode.net #mongodb MongoDB Users List - http://groups.google.com/group/mongodb-user Mongo Chicago Conference - October 2010 B.W. McAdams (10gen) The Elephant In the Room: MongoDB + Hadoop / 17