The road lies plain before me;--'tis a theme
Single and of determined bounds; …
                              - Wordsworth, The Prelude

Krishna Sankar, http://doubleclix.wordpress.com
EC4000 – PhD Guest Seminar, Naval Post Graduate School
Nov 29, 2011
o  Agenda
   o  To cover the broad picture
   o  Understand the waypoints &
   o  Drill down into one area (NOSQL)
   o  Can do others later …
o  Of the Big Data domain …

[Diagram: waypoints]
   What is Big Data ?  |  Big Data to smart data  |  Big Data Pipeline
   Pipeline areas: Analytics/Modeling (R), Analytic Algorithms, Storage - NOSQL,
   Processing - Hadoop, Visualization, …
Thanks to …
The giants whose
 shoulders I am
  standing on 




Special Thanks to:
   Peter Ateshian, NPS
   Prof Murali Tummala, NPS
   Shirley Bailes, O’Reilly
   Ed Dumbill, O’Reilly
   Jeff Barr, AWS
   Jenny Kohr Chynoweth, AWS
When I think of my own native land, 
             In a moment I seem to be there;  

            But, alas! recollection at hand 
 
           Soon hurries me back to despair.
- Cowper, The Solitude Of Alexander Selkirk
What is Big Data ?
“Big data” is data that becomes large enough that it cannot be processed
using conventional methods. @twitter

“Big data” is less about size, more about flow & velocity - persisting
petabytes per year is easier than processing terabytes per hour. @twitter

Ref: http://radar.oreilly.com/2010/09/the-smaq-stack-for-big-data.html
What is Big Data ?

Vinod Khosla’s Cool Dozen!
   Consumers : “Widespread innovation in technologies that reduce data
   overload for users” ~ Data Reduction
   Businesses : “Simple solutions to handle the deluge of data generated
   from various sources …” ~ Big Data Analytics
   TV 2.0, Education, Social NEXT, Tools for sharing interest, Publishing, …

Ref: http://www.ciol.com/News/News/News-Reports/Vinod-Khosla%E2%80%99s-cool-dozen-tech-innovations/156307/0/
http://yourstory.in/2011/11/vinod-khoslas-keynote-at-nasscom-product-conclave-reject-punditry-believe-in-an-idea-take-risk-and-succeed/
EBC322

Volume
   o  Scale
Velocity
   o  Data change rate vs. decision window
Variety
   o  Different sources & formats
   o  Structured vs. Unstructured
Variability
   o  Breadth of interpretation &
   o  Depth of analytics
Contextual
   o  Dynamic variability
   o  Recommendation
Connectedness

http://doubleclix.wordpress.com/2011/09/13/when-is-big-data-really-big-data/
http://www.hpts.ws/posters/Poster2011_13_Bulkowski.pdf
I.    Two Main Types – based on collection
      i.   Big Data Streams
           o  Data in “motion”
           o  Twitter fire hose, Facebook, G+
      ii.  Big Data Logs
           o  Data “at rest”
           o  Logs, DW, external market data, POS, …
II.   Typically, Big Data has a non-deterministic angle as well …
      o  Creative Discovery
      o  Iterative, Model based Analytics
      o  Explore questions to ask
III.  Smart Data = Big Data + context + embedded/interactive (inference,
      reasoning) models
      o  Model Driven
      o  Declaratively Interactive

http://www.slideshare.net/leonsp/hadoop-slides-11-what-is-big-data
http://www.slideshare.net/Dataversity/wed-1550-bacvanskivladimircolor
AWS – 600 Billion objects!

Twitter
   §  200 million tweets/day
   §  Peak 10,000/second
   §  How would you handle the fire hose for social network analytics ?

Zynga
   §  “Analytics company, not a gaming company!”
   §  Harvests data : 15 TB/day
   §  Test new features
   §  Target advertising
   §  230 million players/month

Storage
   §  4 U box = 40 TB, 1 PB = 25 boxes !

http://goo.gl/dcBsQ
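The storage and fire-hose figures above can be checked with a little arithmetic. The sketch below is only a back-of-envelope calculation: it assumes decimal units (1 PB = 1000 TB), and the function names are made up for illustration.

```python
import math

TB_PER_PB = 1000          # assumption: decimal units, so 1 PB = 1000 TB
SECONDS_PER_DAY = 86_400


def boxes_needed(capacity_tb, box_tb=40):
    """Number of 40 TB boxes needed to hold capacity_tb (rounded up)."""
    return math.ceil(capacity_tb / box_tb)


def average_rate_per_second(events_per_day):
    """Mean event rate if the daily volume were spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


boxes = boxes_needed(1 * TB_PER_PB)                # 1 PB of raw capacity
avg_tweets = average_rate_per_second(200_000_000)  # 200 million tweets/day
peak_to_average = 10_000 / avg_tweets              # quoted peak of 10,000/second

print(boxes, round(avg_tweets), round(peak_to_average, 1))
```

The quoted peak is roughly four times the daily average, so a fire-hose consumer probably has to be sized for the peak rather than the mean.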
  
•  6 Billion Messages per day
•  2 PB (w/compression) online
•  6 PB w/ replication
•  250 TB/Month growth
•  HBase Infrastructure
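A rough reading of these numbers, as a sketch only: it assumes that the 6 PB figure is the replicated footprint of the same 2 PB of online data, that the 250 TB/month growth is measured before replication, and that 1 PB = 1000 TB. None of these assumptions is stated on the slide.

```python
online_pb = 2.0              # compressed data online
replicated_pb = 6.0          # footprint with replication
growth_tb_per_month = 250    # assumed to be pre-replication growth

# Copies implied by the two footprints (assumption: same data set).
implied_copies = replicated_pb / online_pb

# Raw disk consumed per month once every byte is stored implied_copies times.
raw_growth_tb_per_month = growth_tb_per_month * implied_copies

# Months until the online data set doubles at a constant growth rate.
months_to_double = (online_pb * 1000) / growth_tb_per_month

print(implied_copies, raw_growth_tb_per_month, months_to_double)
```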
  
•  50 TB/Day
•  240 nodes, 84 PB
•  Teradata Installation
•  Path Analysis
•  A/B Testing
•  Very systematic
•  Diagram speaks volumes!

Ref: http://www.hpts.ws/sessions/2011HPTS-TomFastner.pdf
•  “… they didn’t need a genius, … but build the world’s most impressive
   dilettante … battling the efficient human mind with spectacular
   flamboyant inefficiency” – Final Jeopardy by Stephen Baker
   •  15 TB memory, across 90 IBM 760 servers, in 10 racks
   •  1 TB of dataset
   •  200 Million pages processed by Hadoop
   •  This is a good example of Connected data
          –  Contextual w/ variability
          –  Breadth of interpretation
          –  Analytics depth

http://doubleclix.wordpress.com/2011/03/01/the-education-of-a-machine-%E2%80%93-review-of-book-%E2%80%9Cfinal-jeopardy%E2%80%9D-by-stephen-baker/
http://doubleclix.wordpress.com/2011/02/17/watson-at-jeopardy-a-race-of-machines/
[Diagram: the Big Data landscape, with “Big Data” at the center]
   Storage: Block Store, Object Store, NOSQL
   Parallelism: Map/Reduce, HPC
   Warehouse-style Applications, Distributed Applications
   Analytics, Web Analytics, Log Analytics
   Cloud Architecture
   Social Media, Social Graph, Knowledge Graph
   Inference, Recommendation/Inference Engines
   Search, Indexing
   Machine Learning, Mahout, Classification, Clustering
“A towel is about the most massively useful thing an
     interstellar hitchhiker can have … any man who can
     hitch the length and breadth of the Galaxy, rough it …
     win through, and still know where his towel is, is clearly
     a man to be reckoned with.” 
                     - From The Hitchhiker's Guide to the Galaxy, by Douglas Adams. 
                                                 Published by Harmony Books in 1979




Big Data to Smart Data
   1  Don’t throw away any data !
   2  Be ready for different ways of organizing the data
•  summary

http://goo.gl/fGw7r
Big  Data  Pipeline	


     If a problem has no solution, it is not a problem,
     but a fact, not to be solved but to be coped with,
     over time …
                                             - Peres’s Law
Big  Data  Pipeline	
•  Stages
   o    Collect
   o    Store
   o    Transform & Analyze
   o    Model & Reason
   o    Predict, Recommend & Visualize
•  Different systems have different characteristics
   o  Infrastructure optimization based on application/hardware
      attribute correlation (short term)
        •  Hadoop, Splunk, internal Dashboard
   o  Application performance trends (medium term)
        •  Analytics, Modeling,…
   o  Product Metrics
        •  Feature set vs. usage, what is important to users, stratification
        •  Modeling using R, Visualization layers like Tableau
Big Data Pipeline
Ref: http://goo.gl/Mm83k

[Diagram: pipeline tools plotted against data attributes]
   Attributes (bottom to top): Volume, Velocity, Variability, Variety,
      Connectedness, Context, Model, Infer-ability
   Tools shown: Logs, Scribe, Flume, Hadoop, …; SQL, NOSQL, HDFS, XML, files, …;
      SQL, BI Tools, Hadoop, Pig, Hive, .NET Dryad, various other tools;
      hand coded programs, R, Mahout, …; internal dashboards, Tableau
   Decomplexify!   Contextualize!   Network!   Reason!   Infer!
Build to Fail - “It is working” is not binary	





The  NOSQL  !	


                        I AM monarch of all I survey;
                      My right there is none to dispute; 
 
                 From the centre all round to the sea 
                  I am lord of the fowl and the brute
           - Cowper, The Solitude Of Alexander Selkirk
Agenda
•  Opening Gambit
      –  NOSQL : Toil, Tears & Sweat !
•  The Pragmas
      –  ABCs of NOSQL [ACID, BASE & CAP]
•  The Mechanics
      –  Algorithmics & Mechanisms (For reference)




Referenced Links @ http://doubleclix.wordpress.com/2010/06/20/nosql-talk-references/
What is NOSQL Anyway ?
•  NOSQL != NoSQL or NOSQL != (!SQL)
•  NOSQL = Not Only SQL
•  Can be traced back to Eric Evans[2]!
      –  You can ask him during the afternoon session!
•  Unfortunate Name, but is stuck now
•  Non Relational could have been better
•  Usually Operational, Definitely Distributed
•  NOSQL has certain semantics – need not stay that way
NOSQL
   Key Value
      In-memory  : Memcached
      Disk Based : Redis, Tokyo Cabinet, Dynamo, Voldemort
   Column   : SimpleDB, Google BigTable, HBase, Cassandra, HyperTable, Azure TS
   Document : CouchDB, MongoDB, Lotus Domino, Riak
   Graph    : Neo4j, FlockDB, InfiniteGraph

Ref: [22,51,52]
When I think of my own native land,
                             In a moment I seem to be there;
                                But, alas! recollection at hand
                            Soon hurries me back to despair.
                 - Cowper, The Solitude Of Alexander Selkirk




NOSQL Tales from the field
WHAT WORKS
•  Designer Augmenting RDBMS with a Distributed key
   Value Store[40 : A good talk by Geir]
•  Invitation only designer brand sales
•  Limited inventory sales – start at 12:00, members have
   10 min to grab them. 500K mails every day
•  Keeps brand value, hidden from search
•  Interesting load properties
•  Each item is a row in the DB; BUY NOW reserves it
   –  Can't order more
•  Started out as a Rails app
   –  shared nothing
•  Narrow peaks – half of revenue
Christian Louboutin
                                   Effect


•  ½ amz for Louboutin
•  Use Voldemort
•  Inventory, Shopping Cart,
   Checkout
•  Partition by prod ID
•  Shared infrastructure – “fog”
   not “cloud” - Joyent!
•  In-memory inventory
•  Not afraid of sale anymore!
           And SQL DBs are
           still relevant !
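The "partition by prod ID" and "BUY NOW reserves it" points can be sketched with a toy in-memory inventory. This is an illustration only, not Voldemort's API: the class and method names are hypothetical, plain dictionaries stand in for partitions, and an MD5 digest is used simply as a hash that stays stable across runs.

```python
import hashlib


class PartitionedInventory:
    """Toy in-memory inventory, partitioned by a hash of the product ID."""

    def __init__(self, num_partitions=4):
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition_for(self, product_id):
        # Stable hash, so a product always maps to the same partition.
        digest = hashlib.md5(product_id.encode("utf-8")).hexdigest()
        return self.partitions[int(digest, 16) % len(self.partitions)]

    def stock(self, product_id, quantity):
        self._partition_for(product_id)[product_id] = quantity

    def reserve(self, product_id):
        """Reserve one unit; returns False once the item is sold out."""
        partition = self._partition_for(product_id)
        remaining = partition.get(product_id, 0)
        if remaining <= 0:
            return False
        partition[product_id] = remaining - 1
        return True


inventory = PartitionedInventory()
inventory.stock("shoe-1", 2)
results = [inventory.reserve("shoe-1") for _ in range(3)]
print(results)
```

With two units in stock, the third reservation is refused, which mirrors the "can't order more" rule. A real store would also need the reserve step to be atomic across concurrent buyers, which this single-threaded sketch does not address.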
Typical NOSQL Example Bit.ly
•  Bit.ly URL shortening service, uses MongoDB
•  User, title, URL, hash, labels[I-5], sort by time
•  Scale – ~50M users, ~10K concurrent, ~1.25B shortens
   per month
•  Criteria:
   –  Simple, Zippy FAST, Very Flexible, Reasonable Durability, Low
      cost of ownership
•  Sharded by userid
•  New kind of “dictionary” a word repository, GPS for
   English – context, pronunciations, twitter … developer
   API
•  Characteristics[I-6,Tony Tam’s presentation]
   –  RO-centric, 10,000 reads for every write
   –  Hit a wall with MySQL (4B rows)
   –  MongoDB read was so good that memcached layer was not
      required
   –  MongoDB used 4 times MySQL storage
•  Another example :
   –  Voldemort – Unified Communications, IP-Phone data stored
      keyed off of phone number. Data relatively stable
Large Hadron Collider@CERN
•  DAS is part of giant data management
   enterprise (CMS)
      –  Polyglot Persistence (SQL + NOSQL, Mongo, Couch,
         memcache, HDFS, Lustre, Oracle, MySQL, …)
•    Data Aggregation System [I-1,I-2,I-3,I-4]
      –  Uses MongoDB
      –  Distributed Model, 2-6 PB data
      –  Combine info. from different metadata sources, query
         without knowing their existence, user has domain
         knowledge – but shouldn’t deal with various formats,
         interfaces and query semantics
      –  DAS aggregates, caches and presents data as JSON
         documents – preserving security & integrity




                                            And SQL DBs are
                                            still relevant !
Scaling Twitter
•  Digg
   –  RDBMS places more burden on reads than writes[I-8]
   –  Looked at NOSQL, selected Cassandra
       •  Column oriented, so more structure than key-value
•  Heard from noSQL Boston[http://twitter.com/#search?q=%23nosqllive]
   –  Baidu: 120 node HyperTable cluster managing
      600TB of data
   –  StumbleUpon uses HBase for Analytics
   –  Twitter’s Current Cassandra cluster: 45 nodes
•  Adobe is an HBase shop [I-10,I-11,2]
•  Adobe SaaS Infrastructure – tagging, content aggregation,
   search, storage and so forth
•  Dynamic schema & huge number of records[I-5]
•  40 million records in 2008 to 1 billion with 50 ms response
•  NOSQL not mature in 2008, now good enough
•  Prod Analytics: 40 nodes, largest has 100 nodes

•  BBC is a CouchDB shop [I-13]
•  Sweet spot:
     •  Multi-master, multi datacenter replication
•  Interactive Mediums
     •  Old data to CouchDB
     •  Thus free up DB to do work!
•  Cloudkick is a Cassandra shop[I-12]
•  Cloudkick offers cloud management services
•  Store metrics data
•  Linear scalability for write load
•  Massive write performance
    •  Memory table & serial commit log
•  Low operational costs
•  Data Structure
     –  Metrics, Rolled-up data, Statuses at time slice : all indexed by
        timestamp
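The "memory table & serial commit log" write path, with data indexed by timestamp, can be sketched as follows. This is a simplified illustration and not Cassandra's implementation: the class name is hypothetical, a Python list stands in for the append-only log file, and there is no flushing to disk.

```python
import json


class TinyMetricsStore:
    """Toy write path: append to a serial log first, then update memory."""

    def __init__(self):
        self.commit_log = []   # stands in for an append-only file on disk
        self.memtable = {}     # (metric, timestamp) -> value

    def write(self, timestamp, metric, value):
        # 1. Sequential append, which is cheap and survives a restart.
        self.commit_log.append(json.dumps([timestamp, metric, value]))
        # 2. In-memory update, so reads never have to parse the log.
        self.memtable[(metric, timestamp)] = value

    def range(self, metric, start, end):
        """Values for one metric in a time slice, ordered by timestamp."""
        keys = sorted(k for k in self.memtable
                      if k[0] == metric and start <= k[1] <= end)
        return [(ts, self.memtable[(m, ts)]) for m, ts in keys]

    @classmethod
    def recover(cls, commit_log):
        """Rebuild the memory table by replaying the log in order."""
        store = cls()
        for line in commit_log:
            timestamp, metric, value = json.loads(line)
            store.write(timestamp, metric, value)
        return store


store = TinyMetricsStore()
store.write(10, "cpu", 0.5)
store.write(15, "mem", 0.3)
store.write(20, "cpu", 0.7)
recovered = TinyMetricsStore.recover(store.commit_log)
print(recovered.range("cpu", 0, 100))
```

Because every write is a sequential append plus a dictionary update, write cost stays flat as data grows, which is one plausible reading of the "massive write performance" point.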
•  Guardian/UK
   –  Runs on Redis[I-14] !
   –  “Long-term The Guardian is looking
      towards the adoption of a schema-free
      database to sit alongside its Oracle
      database and is investigating CouchDB.
      … the relational database is now just a
      component in the overall data
      management story, alongside data
      caching, data stores, search engines
      etc.”
   –  NOSQL can increase performance of
      relational data by offloading specific
      data and tasks

And SQL DBs are still relevant !
"The evil that SQL DBs do lives after them; the good is oft interred with
their bones...",
NOSQL at Netflix
•  Netflix is fully in the cloud
•  Uses NOSQL across the globe
•  Customer Profiles, watchlog, usage logging (see next
   slide)
     –  No multi-record locking
•    No DBA !
•    Easier Schema Changes
•    Less complex, Highly Available data store
•    Joins happen in the applications




                                  http://www.hpts.ws/sessions/nosql-ecosystem.pdf
                                  http://www.hpts.ws/sessions/GlobalNetflixHPTS.pdf
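"Joins happen in the applications" means the store only answers key lookups and the application stitches records together. A minimal sketch, assuming two hypothetical collections (profiles keyed by user ID and a watch log); the field names are invented for illustration and are not Netflix's schema.

```python
# Hypothetical data: profiles are looked up by key, the watch log is a list.
profiles = {
    "u1": {"name": "Ann"},
    "u2": {"name": "Bob"},
}
watch_log = [
    {"user": "u1", "title": "Title A"},
    {"user": "u2", "title": "Title B"},
    {"user": "u1", "title": "Title C"},
    {"user": "u9", "title": "Title D"},   # no matching profile
]


def join_watch_log(profiles, watch_log):
    """Application-side inner join of watch log entries with profiles."""
    joined = []
    for entry in watch_log:
        profile = profiles.get(entry["user"])   # one key lookup per entry
        if profile is None:
            continue                            # inner join: skip orphans
        joined.append({"name": profile["name"], "title": entry["title"]})
    return joined


joined = join_watch_log(profiles, watch_log)
print(joined)
```

The trade-off is that the application now owns the join logic and its cost, in exchange for a simpler store with no multi-record locking.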
21 NOSQL Themes
•  Web Scale
•  Scale Incrementally/continuous growth
•  Oddly shaped & exponentially connected
•  Structure data as it will be used – i.e. read, query
•  Know your queries/updates in advance[96], but you can change
   them later
•  Compute attributes at run time
•  Create a few large entities with optional parts
      –  Normalization creates many small entities
•  Define Schemas in models (not in databases)
•  Avoid impedance mismatch
•  Narrow down & solve your core problem
•  Solve the right problem with the right tool

Ref: [I-8]
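Two of the themes above, "a few large entities with optional parts" and "define schemas in models (not in databases)", can be sketched with a small model class. This is one possible illustration under stated assumptions: the entity and its fields are hypothetical, and JSON stands in for whatever document format the store accepts.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Address:
    city: str
    country: str


@dataclass
class UserDocument:
    """One large entity; the schema lives here, not in the database."""
    user_id: str
    name: str
    address: Optional[Address] = None               # optional part
    tags: List[str] = field(default_factory=list)   # optional part

    def to_json(self):
        # Leave out the optional parts that are absent instead of storing NULLs.
        doc = {k: v for k, v in asdict(self).items() if v not in (None, [])}
        return json.dumps(doc, sort_keys=True)


minimal = UserDocument("u1", "Ann")
full = UserDocument("u2", "Bob", Address("Monterey", "US"), ["nosql"])
print(minimal.to_json())
print(full.to_json())
```

A normalized design would probably split the address and tags into their own tables and join them back on read; here they travel with the entity they belong to.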
  
21 NOSQL Themes
•  Existing solutions are clunky[1] (in certain situations)
•  Scale automatically, “becoming prohibitively costly (in
   terms of manpower) to operate” Twitter[I-9]
     •  Distribution & partitioning are built into NOSQL
•  RDBMS distribution & sharding are not fun and are expensive
    –  Lose most functionality along the way
•  Data at the center, Flexible schema, Fewer joins
•  The value of NOSQL is in flexibility as much as it is in “Big
   Data”
21 NOSQL Themes
•  Requirements[3]
    –  Data will not fit in one node
          •  And so need data partition/distribution by the system
    –  Nodes will fail, but data needs to be safe – replication!
    –  Low latency for real-time use
•  Data Locality
    –  Row based structures will need to read whole row,
       even for a column
    –  Column based structures need to scan for each row
•  Solution : Column storage with Locality
    –  Keep together the data that is read together, don’t read
       what you don’t care about
          •  For example friends – other data

Ref: 3
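The data-locality point can be made concrete with a toy layout. The sketch below assumes a hypothetical user record and groups its columns into two families, so that reading the friends list does not drag the rest of the row along; it is not the storage format of any particular column store.

```python
rows = [
    {"id": 1, "name": "ann", "bio": "long text ...", "friends": [2, 3]},
    {"id": 2, "name": "bob", "bio": "long text ...", "friends": [1]},
]


def friends_from_rows(rows, user_id):
    """Row layout: the whole row is fetched just to reach one field."""
    for row in rows:
        if row["id"] == user_id:
            return row["friends"]
    return None


def to_column_families(rows, families):
    """Regroup rows so columns that are read together are stored together."""
    store = {family: {} for family in families}
    for row in rows:
        for family, columns in families.items():
            store[family][row["id"]] = {c: row[c] for c in columns}
    return store


# Hypothetical grouping: friends are read far more often than the profile.
families = {"profile": ["name", "bio"], "social": ["friends"]}
store = to_column_families(rows, families)
friends_of_1 = store["social"][1]["friends"]   # touches only the social family
print(friends_of_1)
```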
  
ABCs of NOSQL - ACID, BASE & CAP
The woods are lovely, dark, and deep, 
          But I have promises to keep, 
       And miles to go before I sleep, 
       And miles to go before I sleep.
                               -Frost
CAP Principle
“CAP Principle →
      Strong Consistency,
      High Availability,
      Partition-resilience:
 Pick at most 2”[37]

[Diagram: triangle with corners Consistency, Availability, Partition]
   C-A No P → Single DB server, no network partition
   C-P No A → Block transaction in case of partition failure
   A-P No C → Expiration based caching, voting majority
              Interesting (& controversial) from NOSQL perspective

 Which feature to discard depends on the nature of your system[41]
ABCs of NOSQL
•  ACID
    o  Atomicity, Consistency, Isolation & Durability –
       fundamental properties of SQL DBMS
•  BASE[35,39]
    o  Basically Available Soft state(Scalable)
       Eventually Consistent
•  CAP[36,39]
    o  Consistency, Availability & Partitioning
    o  This C is ~A+C
         •  i.e. Atomic Consistency[36]
ACID
•  Atomicity
    o  All or nothing
•  Consistent
    o  From one consistent state to another
          •  e.g. Referential Integrity
    o  But it also depends on the application
          •  e.g. min account balance
          •  Predicates, invariants,…
•  Isolation
•  Durability
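Atomicity and an application-level consistency rule (the minimum account balance example above) can be demonstrated with Python's built-in sqlite3 module. This is a minimal sketch: the table layout and the transfer function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 100)])
conn.commit()


def transfer(conn, src, dst, amount, min_balance=0):
    """Move money atomically; refuse if the invariant would be violated."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < min_balance:
                raise ValueError("minimum balance violated")
        return True
    except ValueError:
        return False


def balances(conn):
    return [b for (b,) in conn.execute("SELECT balance FROM accounts ORDER BY id")]


ok = transfer(conn, 1, 2, 50)        # allowed
rejected = transfer(conn, 1, 2, 80)  # would overdraw, so both updates roll back
print(ok, rejected, balances(conn))
```

The second transfer leaves no partial effect: either both updates are applied or neither is, and the database moves only between states that satisfy the invariant.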
  
CAP Pragmas
•  Preconditions
    o    The domain is scalable web apps
    o    Low Latency For real time use
    o    A small sub-set of SQL Functionality
    o    Horizontal Scaling
•  Pritchett[35] talks about relaxing consistency
   across functional groups rather than within functional
   groups
•  Idempotency to consider
    o  Updates inc/dec are rarely idempotent
    o  Order preserving trx are not idempotent either
    o  MVCC is an answer for this (CouchDB)
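The idempotency problem and the MVCC-style answer can be sketched in a few lines. The class below is a hypothetical in-memory stand-in that only imitates the idea of a revision check (an update is accepted only if the writer saw the current revision); it is not CouchDB's API.

```python
class VersionedDoc:
    """Toy document guarded by a revision number."""

    def __init__(self, value):
        self.rev = 1
        self.value = value

    def update(self, expected_rev, new_value):
        # Reject writers that based their change on a stale revision.
        if expected_rev != self.rev:
            return False
        self.value = new_value
        self.rev += 1
        return True


# Naive increment: delivering the same message twice counts twice.
counter = 10
for _ in range(2):
    counter += 1

# Revision-checked increment: the replayed message carries a stale revision.
doc = VersionedDoc(10)
seen_rev, seen_value = doc.rev, doc.value
first = doc.update(seen_rev, seen_value + 1)
replay = doc.update(seen_rev, seen_value + 1)

print(counter, doc.value, first, replay)
```

The naive counter ends one higher than intended, while the versioned document applies the increment once and rejects the duplicate delivery.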
  
Consistency
•  Strict Consistency
   o Any read on Data X will return the most
     recent write on X[42]
•  Sequential Consistency
   o Maintains sequential order from
     multiple processes (No mention of time)
•  Linearizability
   o Add timestamp from loosely
     synchronized processes
Consistency
•  Write availability, not read availability[44]
•  Even load distribution is easier in
   eventually consistent systems
•  Multi-data center support is easier in
   eventually consistent systems
•  Some problems are not solvable with
   eventually consistent systems
•  Code is sometimes simpler to write in
   strongly consistent systems
CAP Essentials – 1 of 3
•  “CAP Principle → Strong Consistency, High
   Availability, Partition-resilience: Pick at
   most 2”[37]
    o  C-A No P → Single DB server, no network
       partition
    o  C-P No A → Block transaction in case of
       partition failure
    o  A-P No C → Expiration based caching, voting
       majority
•  Which feature to discard depends on the
   nature of your system[41]
  
CAP Essentials – 2 of 3
•  Yield vs. Harvest[37]
    o  Yield → Probability of completing a request
    o  Harvest → Fraction of data reflected in the response
•  Some systems tolerate < 100% harvest (e.g. search, i.e. approximate answers OK); others need 100% harvest (e.g. trx, i.e. correct behavior = single well-defined response)
•  For sub-systems that tolerate harvest degradation, CAP makes sense
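
The yield/harvest trade-off can be made concrete with a small sketch. The function `answer_query` and its parameters are hypothetical, and the shard counts are made-up numbers for illustration: a search-like sub-system answers from whatever shards are reachable, while a transaction-like sub-system refuses when harvest would drop below 100%.

```python
def answer_query(shards_up, total_shards, require_full_harvest):
    """Return (completed, harvest) for one request.

    completed feeds into yield (fraction of requests completed);
    harvest is the fraction of the data reflected in the response.
    """
    harvest = shards_up / total_shards
    if require_full_harvest and harvest < 1.0:
        return False, 0.0  # refuse rather than give a partial answer
    return True, harvest


# One of ten shards is unreachable because of a partition.
search = answer_query(9, 10, require_full_harvest=False)  # (True, 0.9)
txn = answer_query(9, 10, require_full_harvest=True)      # (False, 0.0)
```

The search request keeps its yield and gives up some harvest; the transaction gives up yield to protect correctness.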
  	
  	
  
CAP Essentials – 3 of 3
•  Trading harvest for yield – AP
•  Application decomposition & use NOSQL in appropriate sub-systems that have state management and data semantics that match the operational feature & impedance
    o  Hence NotOnly SQL, not No SQL
    o  Intelligent homing to tolerate partition failures[44]
    o  Multi zones in a region (150 miles - 5 ms)
    o  Twitter tweets in Cassandra & MySQL
    o  BBC using MongoDB for offloading DBMS
    o  Polyglot persistence at LHC@CERN
  
   (Most important point in the whole presentation)
Eventual Consistency & AMZ
•  Distribution Transparency[38]
•  In larger distributed systems, network partitions are a given
•  Consistency Models
    o  Strong
    o  Weak
         •  Has an inconsistency window between the update and the guaranteed view
    o  Eventual
         •  If no new updates, all will see the value, eventually
  
Eventual Consistency & AMZ
•  Guarantee variations[38]
    o  Read-your-writes
    o  Session consistency
    o  Monotonic read consistency
         •  Access will not return previous value
    o  Monotonic write consistency
         •  Serialize write by the same process
•  Guarantee order (vector clocks, MVCC)
    o  Example: Amz cart merger (let cart add even with partial failure)
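
A minimal vector-clock sketch for the cart example follows. All names (`increment`, `compare`, `merge`, the node labels A/B/C and the cart contents) are illustrative assumptions, not Dynamo's actual code. Two carts updated independently from the same base are detected as concurrent, and the application reconciles them by taking the union of the items.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a, b):
    """True if clock a has seen everything clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify clock a relative to clock b."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "after"
    if descends(b, a):
        return "before"
    return "concurrent"


def merge(a, b):
    """Element-wise maximum of two clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


base = increment({}, "A")
cart_x = {"clock": increment(base, "B"), "items": {"book", "pen"}}
cart_y = {"clock": increment(base, "C"), "items": {"book", "lamp"}}

relation = compare(cart_x["clock"], cart_y["clock"])
if relation == "concurrent":
    # Semantic reconciliation chosen for this sketch: keep every added item.
    merged = {
        "clock": merge(cart_x["clock"], cart_y["clock"]),
        "items": cart_x["items"] | cart_y["items"],
    }
```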
  
Eventual Consistency & AMZ - SimpleDB
•  SimpleDB strong consistency semantics [49,50]
    o  Until Feb 2010, SimpleDB only supported eventual consistency, i.e. GetAttributes after PutAttributes might not be the same for some time (1 second)
    o  On Feb 24, AWS added ConsistentRead=True attribute for read
    o  Read will reflect all writes that got 200 OK till that time!
  
Eventual Consistency & AMZ - SimpleDB
•  SimpleDB strong consistency semantics [49,50]
    o  Also added conditional put/delete
    o  Put only if an attribute has a specified value (Expected.1.Value=) or exists / does not exist (Expected.1.Exists = true/false)
    o  Same conditional check capability for delete also
    o  Only on one attribute!
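
The conditional put can be pictured as a compare-and-set on a single attribute. The sketch below is an in-memory stand-in, not the SimpleDB API: the class `Item`, the method `conditional_put` and its keyword arguments are hypothetical names that mirror the idea behind Expected.1.Value and Expected.1.Exists.

```python
class Item:
    """Hypothetical stand-in for one item and its attributes."""

    def __init__(self):
        self.attributes = {}

    def conditional_put(self, name, value, expected_name=None,
                        expected_value=None, expected_exists=None):
        """Write name=value only if the single expected attribute matches."""
        if expected_name is not None:
            present = expected_name in self.attributes
            if expected_exists is not None and present != expected_exists:
                return False
            if (expected_value is not None
                    and self.attributes.get(expected_name) != expected_value):
                return False
        self.attributes[name] = value
        return True


item = Item()
# Create the version attribute only if it does not exist yet.
created = item.conditional_put("version", "1",
                               expected_name="version", expected_exists=False)
# Writer A read version 1 and moves it to 2.
writer_a = item.conditional_put("version", "2",
                                expected_name="version", expected_value="1")
# Writer B also read version 1, but is now stale and is rejected.
writer_b = item.conditional_put("version", "2",
                                expected_name="version", expected_value="1")
```

Because the check covers only one attribute, a version or counter attribute is the usual place to hang the condition.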
  
Eventual Consistency & AMZ – S3
•  S3 is an eventual consistency system
    o  Versioning
    o  “S3 PUT & COPY synchronously store data across multiple facilities before returning SUCCESS”
    o  Repair lost redundancy, repair bit-rot
    o  Reduced Redundancy option for data that can be reproduced (99.999999999% vs. 99.99%)
         •  Approx 1/3rd less
    o  CloudFront for caching
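
To get a feel for the two durability figures, here is a small arithmetic sketch. It assumes the percentages are per-object durability over one year and that losses are independent; the function name `expected_annual_loss` and the 10,000-object bucket are made up for illustration.

```python
from fractions import Fraction


def expected_annual_loss(object_count, durability_percent):
    """Expected number of objects lost per year at the given durability."""
    # Fractions avoid floating-point noise in numbers this close to 100%.
    loss_probability = 1 - Fraction(durability_percent) / 100
    return object_count * loss_probability


standard = expected_annual_loss(10_000, "99.999999999")  # 1/10,000,000
reduced = expected_annual_loss(10_000, "99.99")          # 1
```

Under these assumptions, 10,000 objects would lose about one object per year with reduced redundancy, versus one object per ten million years with standard storage.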
  
!SQL ?
•  “We conclude that the current RDBMS code lines, while attempting to be a “one size fits all” solution, in fact, excel at nothing. Hence, they are 25 year old legacy code lines that should be retired in favor of a collection of “from scratch” specialized engines.”[43]
•  “Current systems were built in an era where resources were incredibly expensive, and every computing system was watched over by a collection of wizards in white lab coats, responsible for the care, feeding, tuning and optimization of the system. In that era, computers were expensive and people were cheap”
•  “The 1970 - 1985 period was a time of intense debate, a myriad of ideas, & considerable upheaval. We predict the next fifteen years will have the same feel”
  
Further deliberation
•  Daniel Abadi[45], Mike Stonebraker[46], James Hamilton[47], Pat Helland[48] are all good reads for further deliberation
  
NOSQL Internals & Algorithmics
Caveats
•  A representative subset of the mechanics and mechanisms used in the NOSQL world
•  Being refined & newer ones are being tried
•  At a system level – to show how the techniques play a part to deliver a capability
•  The NOSQL papers and other references for further deliberation
•  Even if we don’t cover fully, it is OK. I want to introduce some of the concepts so that you get an appreciation …
  
NOSQL Mechanics
•  Horizontal Scalability
    –  Gossip (cluster membership)
    –  Failure detection
    –  Consistent hashing
    –  Replication techniques
         •  Hinted handoff
         •  Merkle trees
    –  Sharding in MongoDB
    –  Regions in HBase
•  Performance
    –  SSTables/memtables
    –  LSM w/Bloom filter
•  Integrity/Version reconciliation
    –  Timestamps
    –  Vector clocks
    –  MVCC
    –  Semantic vs. syntactic reconciliation
  	
  
Consistent Hashing
•  Origin: web caching, “to decrease ‘hot spots’”
•  Three goals[87]
    –  Smooth evolution
         •  When a new machine joins, minimum rebalance work and impact
    –  Spread
         •  Objects assigned to a min number of nodes
    –  Load
         •  # of distinct objects assigned to a node is small
  
Consistent Hashing
•  Hash keyspace/token is divided into partitions/ranges
•  Cassandra – choice
    –  OrderPreserving partitioner – key = token (for range queries)
    –  Also saw a CollatingOrderPreservingPartitioner
•  Partitions assigned to nodes that are logically arranged in a circle topology
•  Amz (Dynamo) – assign sets of (random) multiple points to different machines depending on load
•  Cassandra – monitor load & distribute
•  Specific join & leave protocols
•  Replication – next 3 consecutive
•  Cassandra – Rack-aware, Datacenter-aware
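
The ring described above can be sketched as follows. This is a toy, assuming an MD5-derived 32-bit token, eight random-looking points per node and replication to the next three distinct nodes clockwise; the class `Ring` and its method names are hypothetical and do not reflect Cassandra's or Dynamo's code.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (a 32-bit MD5 prefix)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest()[:8], 16)


class Ring:
    def __init__(self, nodes, points_per_node=8, replicas=3):
        self.replicas = replicas
        # Each node owns several points, which smooths out the load.
        self.points = sorted(
            (token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self.tokens = [t for t, _ in self.points]

    def preference_list(self, key):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        start = bisect.bisect_right(self.tokens, token(key))
        owners = []
        for step in range(len(self.points)):
            node = self.points[(start + step) % len(self.points)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == self.replicas:
                break
        return owners


keys = [f"key{i}" for i in range(1000)]
before = Ring(["A", "B", "C", "D"])
after = Ring(["A", "B", "C", "D", "E"])
# Keys whose primary owner changed when node E joined the ring.
moved = [k for k in keys
         if before.preference_list(k)[0] != after.preference_list(k)[0]]
```

This is the "smooth evolution" goal in action: every key in `moved` lands on the new node E, and the other keys keep their primary owner.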
  
Consistent Hashing - Hinted-handoff
•  What happens when a node is not available?
    –  May be under load
    –  May be network partition
•  Sloppy quorum & hinted-handoff
•  R/W performed on the 1st n healthy nodes
•  Replica sent to a host node with hint in metadata & then transferred when the actual node is up
•  Burdens neighboring nodes
•  Cassandra 0.6.2 default is disabled (I think)
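
A rough sketch of sloppy quorum with hinted handoff, under simplifying assumptions: the preference order is a plain list, the `Node` class and the functions `sloppy_write` and `deliver_hints` are hypothetical, and the stand-in node keeps its copy after handing the data back.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (intended_owner_name, key, value)


def sloppy_write(ring_order, key, value, n=3):
    """Write to the first n healthy nodes, leaving hints for skipped owners."""
    intended = ring_order[:n]
    down_owners = [node for node in intended if not node.up]
    healthy = [node for node in ring_order if node.up][:n]
    for node in healthy:
        node.data[key] = value
    # Healthy nodes outside the intended owners act as stand-ins.
    stand_ins = [node for node in healthy if node not in intended]
    for stand_in, owner in zip(stand_ins, down_owners):
        stand_in.hints.append((owner.name, key, value))
    return [node.name for node in healthy]


def deliver_hints(nodes):
    """Hand hinted replicas back to owners that are reachable again."""
    by_name = {node.name: node for node in nodes}
    for node in nodes:
        remaining = []
        for owner_name, key, value in node.hints:
            owner = by_name[owner_name]
            if owner.up:
                owner.data[key] = value
            else:
                remaining.append((owner_name, key, value))
        node.hints = remaining


a, b, c, d = (Node(name) for name in "ABCD")
b.up = False                                   # B is overloaded or partitioned
written_to = sloppy_write([a, b, c, d], "cart:42", ["book"])
hints_while_down = list(d.hints)               # D holds the replica meant for B
b.up = True
deliver_hints([a, b, c, d])
```

The extra work done by D while B is away is the "burdens neighboring nodes" cost noted above.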
  
Consistent Hashing - Replication
•  What happens when a new node joins?
    –  It gets one or more partitions
    –  Dynamo: copy the whole partition
    –  Cassandra: replicate keyset
    –  Cassandra: working on a BitTorrent-type protocol to copy from replicas
  
Anti-entropy
•  Merge and reconciliation operations
    –  Operate on two states and return a new state[86]
•  Merkle Trees
    –  Dynamo uses Merkle trees to detect inconsistencies between replicas
    –  AntiEntropy in Cassandra exchanges Merkle trees and if they disagree, range repair via compaction[91,92]
    –  Cassandra uses the Scuttlebutt reconciliation[86]
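
How a Merkle tree narrows down the disagreement can be sketched like this. It is an illustration under assumptions, not Dynamo's or Cassandra's implementation: each leaf stands for one key range, SHA-256 is an arbitrary choice of hash, and `build_tree` and `differing_leaves` are hypothetical names.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()


def build_tree(leaves):
    """Return the tree levels, from leaf hashes up to the single root hash."""
    level = [digest(repr(leaf).encode("utf-8")) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # pad odd levels by repeating the last hash
            level = level + [level[-1]]
        level = [digest((level[i] + level[i + 1]).encode("ascii"))
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(a, b):
    """Compare two same-shaped trees top-down; return mismatching leaf indexes."""
    if a[-1] == b[-1]:
        return []  # roots match: the replicas agree, nothing to transfer
    suspects = [0]
    for depth in range(len(a) - 2, -1, -1):
        next_suspects = []
        for parent in suspects:
            for child in (2 * parent, 2 * parent + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


replica_a = ["k1=v1", "k2=v2", "k3=v3", "k4=v4"]
replica_b = ["k1=v1", "k2=v2", "k3=STALE", "k4=v4"]
out_of_sync = differing_leaves(build_tree(replica_a), build_tree(replica_b))
# out_of_sync == [2]: only the third range needs repair
```

Only the hashes along the mismatching path are compared, so replicas that mostly agree exchange very little data.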
  
Gossip
•  Membership & failure detection
•  Based on emergence without rigidity – pulse-coupled oscillators, biological systems like fireflies![90]
•  Also used for state propagation
    –  Used in Dynamo/Cassandra
  
Gossip
•  Cassandra exchanges heartbeat state, application state and so forth
•  Every second, a node picks a random live node and a random unreachable node and exchanges key-value structures
•  Some nodes play the part of seeds
•  Seed/initial contact points in a static conf file (storage.conf)
•  Could also come from a configuration service like ZooKeeper
•  To guard against node flap, explicit membership join and leave – now you know why hinted handoff was added
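
State propagation by gossip can be simulated in a few lines. This is a toy model, not Cassandra's gossiper: every node is assumed reachable, there are no seeds, and the names `gossip_round` and `rounds_to_converge`, the seed value and the node count are arbitrary choices for illustration.

```python
import random


def gossip_round(views, rng):
    """Each node exchanges state with one random peer; both keep the newest."""
    names = list(views)
    for name in names:
        peer = rng.choice([n for n in names if n != name])
        merged = {
            k: max(views[name].get(k, 0), views[peer].get(k, 0))
            for k in views[name].keys() | views[peer].keys()
        }
        views[name] = dict(merged)
        views[peer] = dict(merged)


def rounds_to_converge(node_count, seed=1, max_rounds=1000):
    """Count rounds until every node has heard of every other node."""
    rng = random.Random(seed)
    # Each node starts out knowing only its own heartbeat version.
    views = {f"n{i}": {f"n{i}": 1} for i in range(node_count)}
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == node_count for view in views.values()):
            return round_number
    return None  # did not converge within max_rounds


rounds = rounds_to_converge(16)
```

Knowledge spreads roughly exponentially per round, which is why gossip scales well for membership and heartbeat state.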
  	
  
Membership & Failure detection
•  Consensus & atomic broadcast - impossible to solve in an asynchronous distributed system[88,89]
    –  Cannot differentiate between a slow system and a crashed system
•  Completeness
    –  Every system that crashed will be eventually detected
•  Correctness
    –  A correct process is never suspected
•  In short, if you are dead somebody will notice it, and if you are alive, nobody will mistake you for dead!
  
φ Accrual Failure Detector
•  Not a Boolean value but a probabilistic number that “accrues” over an exponential scale
•  Captures the degree of confidence that a corresponding monitored process has crashed[94]
    –  Suspicion level
    –  φ = 1 -> prob(error) 10%
    –  φ = 2 -> prob(error) 1%
    –  φ = 3 -> prob(error) 0.1%
•  If process is dead,
    –  φ is monotonically increasing & φ → ∞ as t → ∞
•  If process is alive and kicking, φ = 0
•  Account for lost messages, network latency and actual crash of system/process
•  Well-known heartbeat period Δi, then network latency Δtr can be tracked by inter-arrival time modeling
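
A minimal sketch of the suspicion level, assuming heartbeat inter-arrival times follow an exponential distribution (a simplification; the detector in [94] models them differently). The function names `mean_gap` and `phi` and the sample arrival times are hypothetical. With that assumption, P(heartbeat arrives later than t) = exp(-t / mean), so φ = -log10 of that probability reduces to t / (mean · ln 10).

```python
import math


def mean_gap(arrival_times):
    """Average inter-arrival time of the heartbeats seen so far."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    return sum(gaps) / len(gaps)


def phi(elapsed, mean_interval):
    """Suspicion level after `elapsed` seconds without a heartbeat.

    Closed form of -log10(exp(-elapsed / mean_interval)); using the
    closed form avoids underflow for very long silences.
    """
    return elapsed / (mean_interval * math.log(10))


heartbeats = [0.0, 1.0, 2.0, 3.0]       # arrival times in seconds
average = mean_gap(heartbeats)          # 1.0
levels = [phi(t, average) for t in (0.0, 1.0, 5.0, 10.0)]
# 10 ** -phi is the chance of being wrong when declaring the process dead
```

An application picks its own threshold on φ instead of a fixed timeout, trading detection speed against the risk of a wrong suspicion.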
  
Write/Read Mechanisms
•  Read & write to a random node (StorageProxy)
•  Proxy coordinates the read and write strategy (R/W = any, quorum et al)
•  Memtables/SSTables from Bigtable
•  Bloom filter/index
•  LSM trees
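
The quorum choice boils down to simple arithmetic. The sketch below uses hypothetical helper names and the usual Dynamo-style rule of thumb that a read of R replicas is sure to see a write acknowledged by W replicas, out of N, only when R + W > N.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def overlaps(n, r, w):
    """True when every read set must intersect every write set."""
    return r + w > n


n = 3
q = quorum(n)                            # 2
quorum_is_safe = overlaps(n, q, q)       # True: 2 + 2 > 3
any_is_safe = overlaps(n, 1, 1)          # False: a read can miss the write
```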
  
[Figure: write/read path on a node. Labels: Write → Commit Logs and MemTable (memory); Flushing → SSTables on disk, each with an Index and a Bloom filter (BF); Read. HBase – WAL, Memstore, HDFS file system.]
SSTable
• Immutable
• Compaction
• Maintain Index & Bloom Filter
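
The write and read path in the figure can be sketched as a toy log-structured store. Everything here is an assumption made for illustration (the class `TinyLSM`, a two-entry memtable limit, lists standing in for disk files), and the per-SSTable index and Bloom filter are left out to keep it short.

```python
class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stand-in for the on-disk append-only log
        self.memtable = {}     # in-memory, mutable
        self.sstables = []     # oldest first; each one is immutable
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log first, for durability
        self.memtable[key] = value            # 2. then update memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a sorted, immutable SSTable."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log = []   # flushed data no longer needs the log

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable wins
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self):
        """Merge all SSTables into one, keeping the newest value per key."""
        merged = {}
        for table in self.sstables:            # older first, newer overwrite
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))]


db = TinyLSM(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.write(key, value)
tables_before = len(db.sstables)   # 2 SSTables, "a" appears in both
latest_a = db.read("a")            # 3, from the newer SSTable
db.compact()
```

Writes never modify an SSTable in place; stale values are only dropped at compaction time.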
How… does HBase work again?
  




           http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
           http://hbaseblog.com/2010/07/04/hug11-hbase-0-90-preview-wrap-up/
Bloom Filter
•  The BloomFilter answers the question
•  “Might there be data for this key in this SSTable?” [Ref: Cassandra/HBase mailer]
    –  “Maybe” or
    –  “Definitely not”
    –  When the BloomFilter says “maybe” we have to go to disk to check out the content of the SSTable
•  Depends on implementation
    –  Redone in Cassandra
    –  HBase 0.20.x removed, will be back in 0.90 with a “jazzy” implementation
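
A small Bloom filter sketch shows where the "maybe" and "definitely not" answers come from. The sizes, the use of SHA-256 and the class and method names are arbitrary choices for illustration and do not reflect the Cassandra or HBase implementations.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, hash_count=3):
        self.size_bits = size_bits
        self.hash_count = hash_count
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        """Derive hash_count bit positions for a key."""
        for i in range(self.hash_count):
            h = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(h[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        """False means definitely not present; True means maybe."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


sstable_keys = ["user:1", "user:2", "user:3"]
bloom = BloomFilter()
for key in sstable_keys:
    bloom.add(key)

answers = [bloom.might_contain(key) for key in sstable_keys]  # all True
```

A key that was added is never reported as absent; a key that was not added is usually reported as absent, with a small false-positive rate that costs one wasted disk read.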
  
Was it a vision, or a waking dream?
Fled is that music:—do I wake or sleep?
                 -Keats, Ode to a Nightingale
•    http://www.readwriteweb.com/enterprise/2011/11/infographic-data-deluge---8-ze.php
•    http://www.crn.com/news/data-center/232200061/efficiency-or-bust-data-centers-drive-for-low-power-solutions-prompts-channel-growth.htm
•    http://www.quantumforest.com/2011/11/do-we-need-to-deal-with-big-data-in-r/
•    http://www.forbes.com/special-report/2011/migration.html
•    http://www.mercurynews.com/bay-area-news/ci_19368103
•    http://www.businessinsider.com/apple-new-data-center-north-carolina-created-50-jobs-2011-11

 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 

Recently uploaded (20)

New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 

The Art of Big Data

  • 1. The road lies plain before me;--'tis a theme Single and of determined bounds; … - Wordsworth, The Prelude. Krishna Sankar, http://doubleclix.wordpress.com. EC4000–PhD Guest Seminar, Naval Postgraduate School, Nov 29, 2011
  • 2. Agenda o To cover the broad picture o Understand the waypoints & drill down into one area (NOSQL) o Can do others later … o Of the Big Data domain … [Diagram: What is Big Data? → Big Data to smart data → Big Data Pipeline: Storage - NOSQL, Processing - Hadoop, Analytic Algorithms, Analytics/Modeling - R, Visualization, …]
  • 3. Thanks to … The giants whose shoulders I am standing on. Special thanks to: Peter Ateshian, NPS; Prof Murali Tummala, NPS; Shirley Bailes, O’Reilly; Ed Dumbill, O’Reilly; Jeff Barr, AWS; Jenny Kohr Chynoweth, AWS
  • 4. When I think of my own native land, In a moment I seem to be there; But, alas! recollection at hand Soon hurries me back to despair. - Cowper, The Solitude of Alexander Selkirk
  • 5. What is Big Data ? “Big data” is data that becomes large enough that it cannot be processed using conventional methods. @twitter “Big data” is less about size, more about flow & velocity - persisting petabytes per year is easier than processing terabytes per hour. @twitter Ref: http://radar.oreilly.com/2010/09/the-smaq-stack-for-big-data.html
  • 6. What is Big Data ? Vinod Khosla’s Cool Dozen! Consumers : “Widespread innovation in technologies that reduce data overload for users” ~ Data Reduction. Businesses : “Simple solutions to handle the deluge of data generated from various sources …” ~ Big Data Analytics. TV 2.0, Education, Social NEXT, Tools for sharing interest, Publishing, … Ref: http://www.ciol.com/News/News/News-Reports/Vinod-Khosla%E2%80%99s-cool-dozen-tech-innovations/156307/0/ http://yourstory.in/2011/11/vinod-khoslas-keynote-at-nasscom-product-conclave-reject-punditry-believe-in-an-idea-take-risk-and-succeed/
  • 7. Volume o Scale. Velocity o Data change rate vs. decision window. Variety o Different sources & formats o Structured vs. Unstructured. Variability o Breadth of interpretation & o Depth of analytics. Contextual o Dynamic variability o Recommendation. Connectedness. http://doubleclix.wordpress.com/2011/09/13/when-is-big-data-really-big-data/ http://www.hpts.ws/posters/Poster2011_13_Bulkowski.pdf
  • 12. I. Two Main Types – based on collection i. Big Data Streams o Data in “motion” o Twitter fire hose, Facebook, G+ ii. Big Data Logs o Data “at rest” o Logs, DW, external market data, POS, … II. Typically, Big Data has a non-deterministic angle as well … o Creative Discovery o Iterative, Model based Analytics o Explore questions to ask III. Smart Data = Big Data + context + embedded/interactive (inference, reasoning) models o Model Driven o Declaratively Interactive http://www.slideshare.net/leonsp/hadoop-slides-11-what-is-big-data http://www.slideshare.net/Dataversity/wed-1550-bacvanskivladimircolor
  • 13. AWS – 600 Billion objects! Twitter § 200 million tweets/day § Peak 10,000/second § How would you handle the fire hose for social network analytics ? Zynga § “Analytics company, not a gaming company!” § Harvests data : 15 TB/day § Test new features § Target advertising § 230 million players/month Storage § 4 U box = 40 TB, 1 PB = 25 boxes ! http://goo.gl/dcBsQ
  • 14. • 6 Billion Messages per day • 2 PB (w/compression) online • 6 PB w/ replication • 250 TB/Month growth • HBase Infrastructure
  • 15. • 50 TB/Day • Very systematic • 240 nodes, 84 PB • Diagram speaks volumes! • Path Analysis • Teradata Installation • A/B Testing Ref: http://www.hpts.ws/sessions/2011HPTS-TomFastner.pdf
  • 16. • “… they didn’t need a genius, … but build the world’s most impressive dilettante … battling the efficient human mind with spectacular flamboyant inefficiency” – Final Jeopardy by Stephen Baker • 15 TB memory, across 90 IBM Power 750 servers, in 10 racks • 1 TB of dataset • 200 Million pages processed by Hadoop • This is a good example of Connected data – Contextual w/ variability – Breadth of interpretation – Analytics depth http://doubleclix.wordpress.com/2011/03/01/the-education-of-a-machine-%E2%80%93-review-of-book-%E2%80%9Cfinal-jeopardy%E2%80%9D-by-stephen-baker/ http://doubleclix.wordpress.com/2011/02/17/watson-at-jeopardy-a-race-of-machines/
  • 17. Big Data [diagram labels]: Warehouse-style Applications; Distributed Applications; Storage; Block Store; Object Store; NOSQL; Analytics; Parallelism; Map/Reduce; HPC; Web Analytics; Cloud Architecture; Social Media; Log Analytics; Inference; Social Graph; Recommendation/Inference Engines; Machine Learning; Mahout; Knowledge Graph; Search, Indexing; Classification, Clustering
  • 18. Big Data to Smart Data. “A towel is about the most massively useful thing an interstellar hitchhiker can have … any man who can hitch the length and breadth of the Galaxy, rough it … win through, and still know where his towel is, is clearly a man to be reckoned with.” - From The Hitchhiker's Guide to the Galaxy, by Douglas Adams. Published by Harmony Books in 1979
  • 19. Big data to smart data • summary 1. Don’t throw away any data ! 2. Be ready for different ways of organizing the data http://goo.gl/fGw7r
  • 20. Big  Data  Pipeline If a problem has no solution, it is not a problem, but a fact, not to be solved but to be coped with, over time … - Peres’s Law
  • 21. Big Data Pipeline • Stages o Collect o Store o Transform & Analyze o Model & Reason o Predict, Recommend & Visualize • Different systems have different characteristics o Infrastructure optimization based on application/hardware attribute correlation (short term) • Hadoop, Splunk, internal Dashboard o Application performance trends (medium term) • Analytics, Modeling,… o Product Metrics • Feature set vs. usage, what is important to users, stratification • Modeling using R, Visualization layers like Tableau
  • 22. Big Data Pipeline Ref: http://goo.gl/Mm83k [Diagram labels] Data attributes: Volume, Velocity, Variability, Variety, Connectedness, Context, Model, Infer-ability. Tools shown: Logs, XML, files, …; Scribe, Flume, …; SQL, NOSQL, HDFS; Hadoop, Pig, Hive, .NET Dryad, various other tools …; SQL, BI Tools; hand coded programs, R, Mahout, …; internal dashboards, Tableau. Decomplexify! Contextualize! Network! Reason! Infer!
  • 23. The NOSQL ! Build to Fail - “It is working” is not binary. I AM monarch of all I survey; My right there is none to dispute; From the centre all round to the sea I am lord of the fowl and the brute - Cowper, The Solitude of Alexander Selkirk
  • 24. Agenda •  Opening Gambit –  NOSQL  :  Toil,  Tears  &  Sweat  !   •  The Pragmas –  ABCs  of  NOSQL  [ACID,  BASE  &  CAP]   •  The Mechanics –  Algorithmics  &  Mechanisms  (For  reference)   Referenced Links @ http://doubleclix.wordpress.com/2010/06/20/nosql-talk-references/
  • 25. What is NOSQL Anyway ? • NOSQL != NoSQL or NOSQL != (!SQL) • NOSQL = Not Only SQL • Can be traced back to Eric Evans[2]! – You can ask him during the afternoon session! • Unfortunate name, but it is stuck now • Non Relational could have been better • Usually Operational, Definitely Distributed • NOSQL has certain semantics – need not stay that way
  • 26. NOSQL • Key Value – In-memory: Memcached – Disk Based: Redis, Tokyo Cabinet, Dynamo, Voldemort • Column – SimpleDB, Google BigTable, HBase, Cassandra, HyperTable, Azure TS • Document – CouchDB, MongoDB, Lotus Domino, Riak • Graph – Neo4j, FlockDB, InfiniteGraph Ref: [22,51,52]
  • 27. NOSQL Tales from the field: WHAT WORKS. When I think of my own native land, In a moment I seem to be there; But, alas! recollection at hand Soon hurries me back to despair. - Cowper, The Solitude of Alexander Selkirk
  • 28. •  Designer Augmenting RDBMS with a Distributed key Value Store[40 : A good talk by Geir] •  Invitation only designer brand sales •  Limited inventory sales – start at 12:00, members have 10 min to grab them. 500K mails every day •  Keeps brand value, hidden from search •  Interesting load properties •  Each item a row in DB-BUY NOW reserves it –  Can't order more •  Started out as a Rails app –  shared nothing •  Narrow peaks – half of revenue
  • 29. Christian Louboutin Effect • ½ amz for Louboutin • Use Voldemort • Inventory, Shopping Cart, Checkout • Partition by prod ID • Shared infrastructure – “fog” not “cloud” - Joyent! • In-memory inventory • Not afraid of sale anymore! And SQL DBs are still relevant !
  • 30. Typical NOSQL Example Bit.ly • Bit.ly URL shortening service, uses MongoDB • User, title, URL, hash, labels[I-5], sort by time • Scale – ~50M users, ~10K concurrent, ~1.25B shortens per month • Criteria: – Simple, Zippy FAST, Very Flexible, Reasonable Durability, Low cost of ownership • Sharded by userid
  • 31. •  New kind of “dictionary” a word repository, GPS for English – context, pronunciations, twitter … developer API •  Characteristics[I-6,Tony Tam’s presentation] –  RO-centric, 10,000 reads for every write –  Hit a wall with MySQL (4B rows) –  MongoDB read was so good that memcached layer was not required –  MongoDB used 4 times MySQL storage •  Another example : –  Voldemort – Unified Communications, IP-Phone data stored keyed off of phone number. Data relatively stable
  • 32. Large Hadron Collider@CERN • DAS is part of giant data management enterprise (CMS) – Polyglot Persistence (SQL + NOSQL, Mongo, Couch, memcache, HDFS, Lustre, Oracle, MySQL, …) • Data Aggregation System [I-1,I-2,I-3,I-4] – Uses MongoDB – Distributed Model, 2-6 PB data – Combine info. from different metadata sources, query without knowing their existence, user has domain knowledge – but shouldn’t deal with various formats, interfaces and query semantics – DAS aggregates, caches and presents data as JSON documents – preserving security & integrity And SQL DBs are still relevant !
  • 34. • Digg – RDBMS places more burden on reads than on writes[I-8] – Looked at NOSQL, selected Cassandra • Column oriented, so more structure than key-value • Heard from noSQL Boston[http://twitter.com/#search?q=%23nosqllive] – Baidu: 120 node HyperTable cluster managing 600TB of data – StumbleUpon uses HBase for Analytics – Twitter’s Current Cassandra cluster: 45 nodes
  • 35. • Adobe is an HBase shop [I-10,I-11,2] • Adobe SaaS Infrastructure – tagging, content aggregation, search, storage and so forth • Dynamic schema & huge number of records[I-5] • 40 million records in 2008 to 1 billion with 50 ms response • NOSQL not mature in 2008, now good enough • Prod Analytics: 40 nodes, largest has 100 nodes • BBC is a CouchDB shop [I-13] • Sweet spot: Multi-master, multi datacenter replication • Interactive Mediums • Old data to CouchDB • Thus free up DB to do work!
  • 36. •  Cloudkick is a Cassandra shop[I-12] •  Cloudkick offers cloud management services •  Store metrics data •  Linear scalability for write load •  Massive write performance •  Memory table & serial commit log •  Low operational costs •  Data Structure –  Metrics, Rolled-up data, Statuses at time slice : all indexed by timestamp
  • 37. • Guardian/UK – Runs on Redis[I-14] ! – “Long-term The Guardian is looking towards the adoption of a schema-free database to sit alongside its Oracle database and is investigating CouchDB. … the relational database is now just a component in the overall data management story, alongside data caching, data stores, search engines etc.” – NOSQL can increase performance of relational data by offloading specific data and tasks. And SQL DBs are still relevant ! "The evil that SQL DBs do lives after them; the good is oft interred with their bones..."
  • 38. NOSQL at Netflix •  Netflix is fully in the cloud •  Uses NOSQL across the globe •  Customer Profiles, watchlog, usage logging (see next slide) –  No multi-record locking •  No DBA ! •  Easier Schema Changes •  Less complex, Highly Available data store •  Joins happen in the applications http://www.hpts.ws/sessions/nosql-ecosystem.pdf http://www.hpts.ws/sessions/GlobalNetflixHPTS.pdf
  • 39.
  • 40. 21 NOSQL Themes • Web Scale • Scale Incrementally/continuous growth • Oddly shaped & exponentially connected • Structure data as it will be used – i.e. read, query • Know your queries/updates in advance[96], but you can change them later • Compute attributes at run time • Create a few large entities with optional parts – Normalization creates many small entities • Define Schemas in models (not in databases) • Avoid impedance mismatch • Narrow down & solve your core problem • Solve the right problem with the right tool Ref: [I-8]
  • 41. 21 NOSQL Themes • Existing solutions are clunky[1] (in certain situations) • Scale automatically, “becoming prohibitively costly (in terms of manpower) to operate” Twitter[I-9] • Distribution & partitioning are built into NOSQL • RDBMS distribution & sharding are not fun and are expensive – Lose most functionality along the way • Data at the center, Flexible schema, Fewer joins • The value of NOSQL is in flexibility as much as it is in “Big Data”
  • 42. 21 NOSQL Themes • Requirements[3] – Data will not fit in one node • And so need data partition/distribution by the system – Nodes will fail, but data needs to be safe – replication! – Low latency for real-time use • Data Locality – Row based structures will need to read whole row, even for a column – Column based structures need to scan for each row • Solution : Column storage with Locality – Keep data that is read together, don’t read what you don’t care about • For example friends – other data Ref: 3
  • 43. ABCs of NOSQL - ACID, BASE & CAP The woods are lovely, dark, and deep, But I have promises to keep, And miles to go before I sleep, And miles to go before I sleep. -Frost
  • 44. CAP Principle “CAP Principle → Strong Consistency, High Availability, Partition-resilience: Pick at most 2”[37] [Diagram: Consistency, Availability, Partition] Which feature to discard depends on the nature of your system[41]
  • 45. CAP Principle “CAP Principle → Strong Consistency, High Availability, Partition-resilience: Pick at most 2”[37] C-A No P → Single DB server, no network partition [Diagram: Consistency, Availability, Partition] Which feature to discard depends on the nature of your system[41]
  • 46. CAP Principle “CAP Principle → Strong Consistency, High Availability, Partition-resilience: Pick at most 2”[37] C-P No A → Block transaction in case of partition failure [Diagram: Consistency, Availability, Partition] Which feature to discard depends on the nature of your system[41]
  • 47. CAP Principle “CAP Principle → Strong Consistency, High Availability, Partition-resilience: Pick at most 2”[37] A-P No C → Expiration based caching, voting majority. Interesting (& controversial) from NOSQL perspective [Diagram: Consistency, Availability, Partition]
  • 48. ABCs of NOSQL • ACID o Atomicity, Consistency, Isolation & Durability – fundamental properties of SQL DBMS • BASE[35,39] o Basically Available Soft state (Scalable) Eventually Consistent • CAP[36,39] o Consistency, Availability & Partitioning o This C is ~A+C • i.e. Atomic Consistency[36]
  • 49. ACID • Atomicity o All or nothing • Consistent o From one consistent state to another • e.g. Referential Integrity o But it is also application dependent • e.g. min account balance • Predicates, invariants,… • Isolation • Durability
  • 50. CAP Pragmas • Preconditions o The domain is scalable web apps o Low latency for real time use o A small sub-set of SQL functionality o Horizontal scaling • Pritchett[35] talks about relaxing consistency across functional groups rather than within functional groups • Idempotency to consider o Updates inc/dec are rarely idempotent o Order preserving trx are not idempotent either o MVCC is an answer for this (CouchDB)
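The idempotency point on slide 50 can be sketched in a few lines. This is a minimal illustration, not CouchDB's API: the `Store` class, its method names, and the revision counter are all hypothetical, with the revision check standing in for an MVCC-style test.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale revision."""


class Store:
    """Toy in-memory store; each key holds a (revision, value) pair."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key, (0, 0))

    def increment(self, key, delta):
        # Not idempotent: replaying the same request changes the result again.
        rev, value = self.get(key)
        self._data[key] = (rev + 1, value + delta)

    def put_if_revision(self, key, expected_rev, value):
        # MVCC-style write: accepted only if the caller saw the latest revision,
        # so a replayed request is rejected instead of being applied twice.
        rev, _ = self.get(key)
        if rev != expected_rev:
            raise VersionConflict(f"expected rev {expected_rev}, found {rev}")
        self._data[key] = (rev + 1, value)


store = Store()
store.increment("balance", 10)
store.increment("balance", 10)       # a retried message double-counts
blind_result = store.get("balance")

rev, value = store.get("balance")
store.put_if_revision("balance", rev, value + 10)
try:
    store.put_if_revision("balance", rev, value + 10)  # retry with a stale revision
    retry_applied = True
except VersionConflict:
    retry_applied = False
```

The retried `increment` silently adds 10 twice, while the retried revision-checked write is refused, which is the property that makes replays safe.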
  • 51. Consistency • Strict Consistency o Any read on Data X will return the most recent write on X[42] • Sequential Consistency o Maintains sequential order from multiple processes (No mention of time) • Linearizability o Add timestamp from loosely synchronized processes
  • 52. Consistency • Write availability, not read availability[44] • Even load distribution is easier in eventually consistent systems • Multi-data center support is easier in eventually consistent systems • Some problems are not solvable with eventually consistent systems • Code is sometimes simpler to write in strongly consistent systems
  • 53. CAP Essentials – 1 of 3 • “CAP Principle → Strong Consistency, High Availability, Partition-resilience: Pick at most 2”[37] o C-A No P → Single DB server, no network partition o C-P No A → Block transaction in case of partition failure o A-P No C → Expiration based caching, voting majority • Which feature to discard depends on the nature of your system[41]
  • 54. CAP Essentials – 2 of 3 • Yield vs. Harvest[37] o Yield → Probability of completing a request o Harvest → Fraction of data reflected in the response • Some systems tolerate < 100% harvest (e.g. search, i.e. approximate answers OK); others need 100% harvest (e.g. Trx, i.e. correct behavior = single well defined response) • For sub-systems that tolerate harvest degradation, CAP makes sense
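As a rough illustration of yield and harvest, the sketch below models a search over shards where an unreachable shard is represented by `None`. The function names and the shard layout are assumptions made for this example only.

```python
def yield_fraction(completed_requests, total_requests):
    """Yield: fraction of requests that completed at all."""
    return completed_requests / total_requests


def search(shards, query):
    """Answer from whichever shards respond; report the harvest alongside the hits.

    Harvest: fraction of the data (here, of the shards) reflected in the response.
    """
    hits, answered = [], 0
    for shard in shards:
        if shard is None:          # shard unreachable, e.g. behind a partition
            continue
        answered += 1
        hits.extend(doc for doc in shard if query in doc)
    return hits, answered / len(shards)


shards = [["big data", "nosql"], None, ["data pipeline"], ["hadoop"]]
hits, harvest = search(shards, "data")
request_yield = yield_fraction(completed_requests=1, total_requests=1)
```

Here the request still completes (yield stays at 1.0) but reflects only three of four shards, which is the harvest-for-yield trade that a search subsystem can tolerate and a transaction cannot.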
  • 55. CAP Essentials – 3 of 3 • Trading Harvest for yield – AP • Application decomposition & use NOSQL in appropriate sub-systems that have state management and data semantics that match the operational feature & impedance o Hence NotOnly SQL not No SQL o Intelligent homing to tolerate partition failures[44] o Multi zones in a region (150 miles - 5 ms) o Twitter tweets in Cassandra & MySQL o BBC using MongoDB for offloading DBMS o Polyglot persistence at LHC@CERN
  • 56. CAP Essentials – 3 of 3 • Trading Harvest for yield – AP • Application decomposition & use NOSQL in appropriate sub-systems that have state management and data semantics that match the operational feature & impedance o Hence NotOnly SQL not No SQL o Intelligent homing to tolerate partition failures[44] o Multi zones in a region (150 miles - 5 ms) o Twitter tweets in Cassandra and MySQL o BBC using MongoDB for offloading DBMS o Polyglot persistence at LHC@CERN [Callout: Most important point in the whole presentation]
  • 57. Eventual Consistency & AMZ • Distribution Transparency[38] • In larger distributed systems, network partitions are a given • Consistency Models o Strong o Weak • Has an inconsistency window between the update and the guaranteed view o Eventual • If no new updates, all will see the value, eventually
  • 58. Eventual Consistency & AMZ • Guarantee variations[38] o Read-Your-writes o Session consistency o Monotonic Read consistency • Access will not return previous value o Monotonic Write consistency • Serialize write by the same process • Guarantee order (vector clocks, mvcc) o Example : Amz Cart merger (let cart add even with partial failure)
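The vector-clock ordering and cart merge mentioned on slide 58 can be sketched as below. This is an illustrative model only: clocks are plain dicts of node name to counter, carts are sets, and the node names `x`, `y`, `z` are made up for the example.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "after"
    if descends(b, a):
        return "before"
    return "concurrent"   # neither version supersedes the other


def merge(a, b):
    """Element-wise maximum: the clock of a reconciled version."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


base = increment({}, "x")          # cart written via node x
clock_y = increment(base, "y")     # one replica adds an item via node y
clock_z = increment(base, "z")     # another adds an item via node z during a partition
cart_y, cart_z = {"book", "towel"}, {"book", "lamp"}

relation = compare(clock_y, clock_z)
merged_cart = cart_y | cart_z      # semantic reconciliation: keep every added item
merged_clock = merge(clock_y, clock_z)
```

Because the two versions are concurrent, neither can be discarded; taking the union of the carts keeps every "add to cart", which is the behavior the cart example on the slide describes.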
  • 59. Eventual Consistency & AMZ - SimpleDB • SimpleDB strong consistency semantics [49,50] o Until Feb 2010, SimpleDB only supported eventual consistency i.e. GetAttributes after PutAttributes might not be the same for some time (1 second) o On Feb 24, AWS added ConsistentRead=True attribute for read o Read will reflect all writes that got 200 OK till that time!
  • 60. Eventual Consistency & AMZ - SimpleDB • SimpleDB strong consistency semantics [49,50] o Also added conditional put/delete o Put attribute has a specified value (Expected.1.Value=) or (Expected.1.Exists = true/false) o Same conditional check capability for delete also o Only on one attribute !
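The conditional put described on slide 60 can be modeled with a small in-memory sketch. It is not the SimpleDB client API: `ItemStore`, `ConditionalCheckFailed`, and the `expected_*` parameter names are hypothetical stand-ins for the `Expected.1.Value` / `Expected.1.Exists` request parameters, and the check covers a single attribute, as the slide notes.

```python
class ConditionalCheckFailed(Exception):
    """The expected attribute state did not match the stored item."""


class ItemStore:
    """Toy item store with a one-attribute precondition on put and delete."""

    def __init__(self):
        self._items = {}

    def _check(self, item, name, value, exists):
        current = self._items.get(item, {})
        if exists is not None and (name in current) != exists:
            raise ConditionalCheckFailed(f"{name!r} exists={name in current}")
        if value is not None and current.get(name) != value:
            raise ConditionalCheckFailed(f"{name!r} is {current.get(name)!r}")

    def put(self, item, attrs, expected_name=None, expected_value=None, expected_exists=None):
        if expected_name is not None:
            self._check(item, expected_name, expected_value, expected_exists)
        self._items.setdefault(item, {}).update(attrs)

    def delete(self, item, expected_name=None, expected_value=None, expected_exists=None):
        if expected_name is not None:
            self._check(item, expected_name, expected_value, expected_exists)
        self._items.pop(item, None)

    def get(self, item):
        return dict(self._items.get(item, {}))


store = ItemStore()
# Create only if the item has no "version" attribute yet.
store.put("page1", {"views": "1", "version": "1"},
          expected_name="version", expected_exists=False)
# Optimistic update: succeeds only while "version" is still "1".
store.put("page1", {"views": "2", "version": "2"},
          expected_name="version", expected_value="1")
try:
    store.put("page1", {"views": "2", "version": "2"},
              expected_name="version", expected_value="1")   # stale writer
    stale_write_applied = True
except ConditionalCheckFailed:
    stale_write_applied = False
```

Using a version attribute as the single checked attribute is one common way to get optimistic concurrency out of a one-attribute precondition.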
  • 61. Eventual Consistency & AMZ – S3 • S3 is an eventual consistency system o Versioning o “S3 PUT & COPY synchronously store data across multiple facilities before returning SUCCESS” o Repair lost redundancy, repair bit-rot o Reduced Redundancy option for data that can be reproduced (99.999999999% vs. 99.99%) • Approx 1/3rd less o CloudFront for caching
  • 62. !SQL ? • “We conclude that the current RDBMS code lines, while attempting to be a “one size fits all” solution, in fact, excel at nothing. Hence, they are 25 year old legacy code lines that should be retired in favor of a collection of “from scratch” specialized engines.”[43] • “Current systems were built in an era where resources were incredibly expensive, and every computing system was watched over by a collection of wizards in white lab coats, responsible for the care, feeding, tuning and optimization of the system. In that era, computers were expensive and people were cheap” • “The 1970 - 1985 period was a time of intense debate, a myriad of ideas, & considerable upheaval. We predict the next fifteen years will have the same feel”
  • 63. Further deliberation • Daniel Abadi[45], Mike Stonebraker[46], James Hamilton[47], Pat Helland[48] are all good reads for further deliberation
  • 64. NOSQL Internals & Algorithmics
  • 65. Caveats • A representative subset of the mechanics and mechanisms used in the NOSQL world • These are being refined & newer ones are being tried • At a system level – to show how the techniques play a part to deliver a capability • The NOSQL papers and other references are there for further deliberation • Even if we don’t cover everything fully, it is OK. I want to introduce some of the concepts so that you get an appreciation …
  • 66. NOSQL Mechanics • Horizontal Scalability – Gossip (Cluster membership) – Failure Detection – Consistent Hashing – Replication Techniques • Hinted Handoff • Merkle Trees – Sharding MongoDB – Regions in HBase • Performance – SSTables/memtables – LSM w/Bloom Filter • Integrity/Version reconciliation – Timestamps – Vector Clocks – MVCC – Semantic vs. syntactic reconciliation
  • 67. Consistent Hashing • Origin: web caching, “to decrease ‘hot spots’” • Three goals[87] – Smooth evolution • When a new machine joins, minimum rebalance work and impact – Spread • Objects assigned to a min number of nodes – Load • # of distinct objects assigned to a node is small
  • 68. Consistent Hashing • Hash Keyspace/Token is divided into partitions/ranges • Cassandra – choice – OrderPreserving partitioner – key = token (for range queries) – Also saw a CollatingOrderPreservingPartitioner • Partitions assigned to nodes that are logically arranged in a circle topology • Amz (Dynamo) – assign sets of (random) multiple points to different machines depending on load • Cassandra – monitor load & distribute • Specific join & leave protocols • Replication – next 3 consecutive • Cassandra – Rack-aware, Datacenter-aware
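The ring described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Dynamo's or Cassandra's implementation: the `HashRing` class, the `vnodes` parameter and the use of MD5 to place tokens are choices made here for the example. Each node gets several random points on the ring, and a key's replicas are the next distinct nodes found walking clockwise from the key's token.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a ring position (MD5 is used only to spread keys)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes      # points per node (assumed value)
        self._tokens = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Join: place vnodes points for this node on the ring."""
        for i in range(self.vnodes):
            t = token(f"{node}#{i}")
            bisect.insort(self._tokens, t)
            self._owner[t] = node

    def remove(self, node):
        """Leave: drop every point that belongs to this node."""
        self._tokens = [t for t in self._tokens if self._owner[t] != node]
        self._owner = {t: n for t, n in self._owner.items() if n != node}

    def preference_list(self, key, n=3):
        """Walk clockwise from the key's token, collecting n distinct nodes."""
        if not self._tokens:
            return []
        start = bisect.bisect_right(self._tokens, token(key))
        result = []
        for i in range(len(self._tokens)):
            node = self._owner[self._tokens[(start + i) % len(self._tokens)]]
            if node not in result:
                result.append(node)
                if len(result) == n:
                    break
        return result


ring = HashRing(["A", "B", "C", "D"])
replicas = ring.preference_list("user:42", n=3)  # three distinct nodes
```

When a fifth node joins, only the keys whose first clockwise point now belongs to the new node change owner; everything else stays put, which is the "smooth evolution" goal.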
  • 69. Consistent Hashing - Hinted-handoff • What happens when a node is not available? – May be under load – May be a network partition • Sloppy Quorum & Hinted-handoff • R/W performed on the first n healthy nodes • The replica is sent to a stand-in host node with a hint in the metadata & then transferred when the actual node is back up • Burdens neighboring nodes • Cassandra 0.6.2 default is disabled (I think)
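A toy model of the sloppy quorum and hinted handoff described above. The `Node` class and the two functions are hypothetical names for this sketch, and the ring is just a list in clockwise order. As a simplification, the stand-in node keeps the value in its normal data as well as in its hint list.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value
        self.hints = []   # (intended node name, key, value)


def sloppy_write(ring, start, key, value, n=3):
    """Write to the first n healthy nodes clockwise from index start.

    A healthy node outside the ideal replica set stores a hint naming
    the down node it is standing in for.
    """
    ideal = [ring[(start + i) % len(ring)] for i in range(n)]
    down = [node for node in ideal if not node.up]
    written = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)]
        if not node.up:
            continue
        node.data[key] = value
        if node not in ideal and down:
            node.hints.append((down.pop(0).name, key, value))
        written.append(node.name)
        if len(written) == n:
            break
    return written


def replay_hints(ring):
    """Hand hinted replicas to their intended owners once they are up."""
    by_name = {node.name: node for node in ring}
    for node in ring:
        if not node.up:
            continue
        remaining = []
        for target, key, value in node.hints:
            if by_name[target].up:
                by_name[target].data[key] = value
            else:
                remaining.append((target, key, value))
        node.hints = remaining
```

With nodes A to E and B down, a write starting at A lands on A, C and D; D holds a hint for B and delivers it when `replay_hints` runs after B recovers. The extra work on D is the "burdens neighboring nodes" cost noted above.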
  • 70. Consistent Hashing - Replication • What happens when a new node joins? – It gets one or more partitions – Dynamo: Copy the whole partition – Cassandra: Replicate keyset – Cassandra: working on a BitTorrent-type protocol to copy from replicas
  • 71. Anti-entropy • Merge and reconciliation operations – Operate on two states and return a new state[86] • Merkle Trees – Dynamo uses Merkle trees to detect inconsistencies between replicas – AntiEntropy in Cassandra exchanges Merkle trees and, if they disagree, does a range repair via compaction[91,92] – Cassandra uses the Scuttlebutt Reconciliation[86]
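To make the Merkle tree comparison concrete, here is a minimal sketch, assuming keys are bucketed into a fixed power-of-two number of leaf ranges by a hash prefix. The function names and the bucketing rule are assumptions of this example, not the layout Dynamo or Cassandra use. Two replicas compare root hashes first and descend only into subtrees that differ, so only the mismatched ranges need repair.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(store, leaves=8):
    """Return tree levels, root level first, over `leaves` key buckets.

    `leaves` must be a power of two; keys are bucketed by hash prefix.
    """
    buckets = [[] for _ in range(leaves)]
    for key in sorted(store):
        buckets[_h(key.encode())[0] % leaves].append(f"{key}={store[key]}")
    level = [_h("\n".join(b).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent hash covers the two child hashes below it.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels[::-1]


def differing_leaves(a, b):
    """Descend only into subtrees whose hashes differ; return leaf indexes."""
    suspects = [0] if a[0][0] != b[0][0] else []
    for depth in range(1, len(a)):
        suspects = [
            child
            for parent in suspects
            for child in (2 * parent, 2 * parent + 1)
            if a[depth][child] != b[depth][child]
        ]
    return suspects


replica_a = {f"k{i}": i for i in range(100)}
replica_b = dict(replica_a)
replica_b["k42"] = -1  # one divergent value
to_repair = differing_leaves(build_tree(replica_a), build_tree(replica_b))
```

Here `to_repair` holds the single leaf range containing `k42`; identical replicas give an empty list after one root comparison.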
  • 72. Gossip • Membership & Failure detection • Based on emergence without rigidity – pulse-coupled oscillators, biological systems like fireflies![90] • Also used for state propagation – Used in Dynamo/Cassandra
  • 73. Gossip • Cassandra exchanges heartbeat state, application state and so forth • Every second, a node picks a random live node and a random unreachable node and exchanges key-value structures with them • Some nodes play the part of seeds • Seeds / initial contact points are listed in a static conf file (storage.conf) • Could also come from a configuration service like ZooKeeper • To guard against node flap, explicit membership join and leave – now you know why hinted handoff was added
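A small simulation of the exchange described above, as a sketch only. `GossipNode`, `gossip_round` and the push-pull merge rule (keep the entry with the higher heartbeat counter) are assumptions of this example. It ignores unreachable nodes and seeds, and lets every node contact any other node directly.

```python
import random


class GossipNode:
    def __init__(self, name):
        self.name = name
        # endpoint name -> (heartbeat counter, application state)
        self.view = {name: (0, "NORMAL")}

    def beat(self):
        """Bump this node's own heartbeat counter."""
        counter, state = self.view[self.name]
        self.view[self.name] = (counter + 1, state)

    def merge(self, other_view):
        """Keep, per endpoint, the entry with the higher heartbeat counter."""
        for endpoint, (counter, state) in other_view.items():
            if endpoint not in self.view or self.view[endpoint][0] < counter:
                self.view[endpoint] = (counter, state)


def gossip_round(nodes, rng):
    """One 'second': every node beats, then gossips with one random peer."""
    for node in nodes:
        node.beat()
        peer = rng.choice([n for n in nodes if n is not node])
        snapshot = dict(node.view)   # push-pull: exchange both views
        node.merge(peer.view)
        peer.merge(snapshot)


nodes = [GossipNode(f"n{i}") for i in range(16)]
rng = random.Random(1)
for _ in range(30):
    gossip_round(nodes, rng)
```

Each node starts out knowing only itself, and knowledge of the rest of the cluster spreads exponentially, so membership typically converges in a number of rounds logarithmic in the cluster size.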
  • 74. Membership & Failure detection • Consensus & Atomic Broadcast - impossible to solve in an asynchronous distributed system where processes can crash[88,89] – Cannot differentiate between a slow system and a crashed system • Completeness – Every system that crashed will be eventually detected • Correctness – A correct process is never suspected • In short, if you are dead somebody will notice it and if you are alive, nobody will mistake you for dead!
  • 75. φ Accrual Failure Detector • Not a Boolean value but a probabilistic number that “accrues” over an exponential scale • Captures the degree of confidence that a corresponding monitored process has crashed[94] – Suspicion Level – φ = 1 -> prob(error) 10% – φ = 2 -> prob(error) 1% – φ = 3 -> prob(error) 0.1% • If process is dead, – φ is monotonically increasing & φ → ∞ as t → ∞ • If process is alive and kicking, φ = 0 • Accounts for lost messages, network latency and actual crash of system/process • With a well-known heartbeat period Δi, the network latency Δtr can be tracked by modeling inter-arrival times
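The suspicion levels above follow from defining φ as minus the base-10 logarithm of the probability that a heartbeat will still arrive, φ(t) = −log10(P_later(t)), so φ = 1, 2, 3 correspond to 10%, 1% and 0.1%. The sketch below assumes an exponential model of inter-arrival times with the observed mean, which gives φ = elapsed / (mean · ln 10). That distribution, the class name and the window size are simplifying assumptions of this example; other distributions can be fitted to the same samples.

```python
import math
from collections import deque


class PhiAccrualDetector:
    def __init__(self, window=1000):
        self.intervals = deque(maxlen=window)  # recent inter-arrival times
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Suspicion level at time `now`; 0.0 until there is enough history."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_arrival
        # Exponential model: P_later = exp(-elapsed / mean),
        # so phi = -log10(P_later) = elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for second in range(11):          # heartbeats at t = 0, 1, ..., 10
    detector.heartbeat(float(second))
suspicion = detector.phi(15.0)    # no heartbeat for 5 seconds
```

The application picks its own threshold on φ instead of a fixed timeout, trading detection speed against the chance of wrongly suspecting a live node.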
  • 76. Write/Read Mechanisms • Read & Write to a random node (StorageProxy) • Proxy coordinates the read and write strategy (R/W = any, quorum et al) • Memtables/SSTables from Bigtable • Bloom Filter/Index • LSM Trees
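  • 77. [Figure: write/read path on a node. In memory: a write goes to the Commit Logs and the MemTable; reads are served from the node. On disk: the MemTable is flushed to SSTables, each with its own Index and Bloom Filter (BF). HBase labels: WAL, Memstore, HDFS file system.] SSTable • Immutable • Compaction • Maintain Index & Bloom Filter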
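The write/read path in the figure can be sketched as a tiny log-structured store. `TinyLSM` and its `memtable_limit` are hypothetical names for this illustration; it keeps everything in Python lists rather than on disk, and it leaves out the per-SSTable index and Bloom filter.

```python
class TinyLSM:
    def __init__(self, memtable_limit=3):
        self.commit_log = []      # append-only record of unflushed writes
        self.memtable = {}        # in-memory, mutable
        self.sstables = []        # stand-in for disk: immutable, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.commit_log.append((key, value))   # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            # An SSTable is a sorted run of pairs that is never modified.
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs the log

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):   # newest SSTable wins
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self):
        """Merge all SSTables into one, keeping the newest value per key."""
        merged = {}
        for table in self.sstables:             # oldest first, newer overwrite
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)      # second key triggers a flush to the first SSTable
db.put("a", 3)      # newer value for "a" lives in the memtable
```

Because SSTables are immutable, an update never rewrites old data; stale versions pile up across tables until compaction merges them, which is why reads must check the newest data first.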
  • 78. How… does HBase work again? http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html http://hbaseblog.com/2010/07/04/hug11-hbase-0-90-preview-wrap-up/
  • 79. Bloom Filter • The BloomFilter answers the question • “Might there be data for this key in this SSTable?” [Ref: Cassandra/HBase mailer] – “Maybe” or – “Definitely not” – When the BloomFilter says “maybe” we have to go to disk to check out the content of the SSTable • Depends on implementation – Redone in Cassandra – HBase 0.20.x removed it, will be back in 0.90 with a “jazzy” implementation
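A minimal Bloom filter showing the "maybe" / "definitely not" behavior described above. The bit-array size, the number of hash functions and the way positions are derived from SHA-256 are assumptions of this sketch, not what Cassandra or HBase do.

```python
import hashlib


class BloomFilter:
    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits)  # one byte per bit, for clarity

    def _positions(self, key):
        """Derive `hashes` bit positions for a key by salting SHA-256."""
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.array[pos] = 1

    def might_contain(self, key):
        """False means 'definitely not'; True means 'maybe'."""
        return all(self.array[pos] for pos in self._positions(key))


bf = BloomFilter()
for i in range(100):
    bf.add(f"row{i}")
hit = bf.might_contain("row7")        # always True for an added key
miss = bf.might_contain("absent-key") # usually False, occasionally True
```

An added key can never be reported as absent, so a "definitely not" lets a read skip that SSTable entirely; the price is an occasional false "maybe" that costs one wasted disk lookup.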
  • 80. Was it a vision, or a waking dream? Fled is that music:—do I wake or sleep? -Keats, Ode to a Nightingale