NoSQL


   By Zenyk Matchyshyn
   Staff Engineer, Lohika
Agenda
 •   History
 •   Architecture vs Technology
 •   Classification
 •   Pros and Cons of usage
 •   Trends
 •   Q/A




HISTORY


History
 •   NoSQL Technologies are not new
 •   Many ideas originate from distributed
     computing, grid computing and parallel
     computing
 •   Main drivers:
     •   Scalability
     •   Parallelization
     •   Costs


Google
 •   In the beginning… there was Google!
 •   Google shared scientific papers:
     •   “The Google File System”, October 2003
     •   “MapReduce: Simplified Data Processing on
         Large Clusters”, December 2004
     •   “Bigtable: A Distributed Storage System for
         Structured Data”, November 2006
     •   “The Chubby Lock Service for Loosely-Coupled
         Distributed Systems”, November 2006
Amazon

 •   … and Amazon!
 •   “Dynamo: Amazon’s Highly Available Key-value
     Store”, October 2007




New technologies!


 •   Creators of Lucene wanted to create a full
     search solution
 •   Ended up with Hadoop and Hadoop
     Distributed File System (HDFS)
 •   Success helped adoption and new solutions
     emerged




ARCHITECTURE VS TECHNOLOGY



Architecture vs Technology


 •   SQL is not bad, it’s just different
 •   You can use a SQL DB in a NoSQL way, e.g.
     MySQL as a key-value database
 •   You can do SQL queries on Hadoop data
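
As a rough illustration of using a SQL database "in a NoSQL way", the sketch below layers get/put/delete on one two-column table. It is only a sketch under stated assumptions: SQLite (from the Python standard library) stands in for MySQL so the example runs without a server, and the class name SqlKeyValueStore and the table name kv are hypothetical.

```python
import sqlite3


class SqlKeyValueStore:
    """Key-value access layered on a single two-column SQL table."""

    def __init__(self, path=":memory:"):
        # Assumption: SQLite stands in for MySQL here.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE is SQLite syntax; on MySQL the equivalent would be
        # REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        with self.conn:
            self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))
```

No joins, no secondary indexes, no schema beyond key and value: the relational engine is used only for primary-key lookups, which is the access pattern a key-value store offers.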




Architecture


 •   The way you store data
 •   The way you query data
 •   Technology environment




CLASSIFICATION


Terms


 •   ACID – Atomicity, Consistency, Isolation,
     Durability
 •   CAP Theorem – Consistency, Availability,
     Partition tolerance
 •   Eventual consistency
 •   Hashing
 •   Schema
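
To make the "Hashing" term concrete, here is a minimal sketch of one common way hashing is used to spread keys across nodes: a consistent hash ring. The slide does not say which hashing scheme is meant, so the choice of consistent hashing is an assumption, as are the names HashRing, node_for and the vnodes parameter.

```python
import bisect
import hashlib


def _hash(value):
    # MD5 is used only to spread keys evenly, not for security.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Maps each key to the first node point clockwise from the key's hash."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # Wrap around to the first point when the hash is past the last one.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The point of the ring is that removing a node only reassigns the keys that node owned; keys owned by the remaining nodes stay where they were.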


Classification


 •   Column oriented stores
 •   Key/Value stores
 •   Key/Value stores with configurable
     consistency
 •   Document stores
 •   Graph stores



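Chart

[Chart: Scalability & Performance (vertical axis) against Depth of Functionality (horizontal axis). In order of decreasing scalability and increasing functionality: memcached, Key/value, Column oriented, Document store, RDBMS]
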
Column oriented
 •   Based on Google Bigtable
 •   Column oriented is the reverse of row oriented
 •   Assumes that data centers are
     transcontinental and connected over the
     standard Internet
 •   C and P from CAP Theorem
 •   Data is consistent and partition tolerant, but
     availability can suffer
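
The "reverse of row oriented" idea can be sketched in a few lines. This only illustrates the general layout difference, not the on-disk format of Bigtable or HBase; the sample records and the helper name to_columns are made up for the example, and the helper assumes every row has the same fields.

```python
# Row oriented: each record is stored together.
rows = [
    {"id": 1, "name": "Ann", "age": 31},
    {"id": 2, "name": "Bob", "age": 27},
    {"id": 3, "name": "Eve", "age": 45},
]


def to_columns(rows):
    """Regroup records so that all values of one field are stored together."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


# Column oriented: each field is stored together.
columns = to_columns(rows)

# A scan over one field touches a single list instead of every record.
average_age = sum(columns["age"]) / len(columns["age"])
```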


HBase
 •   Spin off from Hadoop project -
     http://hbase.apache.org/
 •   Written in Java
 •   A lot of interfaces – Thrift, REST, JRuby, etc.
 •   SQL-like access through Hive -
     http://hive.apache.org/
 •   HBase ORM – Surus -
     https://github.com/mushkevych/surus
 •   Used by Facebook, Hulu, Yahoo!, Ning, etc.

Hypertable
 •   Developed by Zvents, open sourced
 •   Written in C++
 •   Runs on top of a distributed file system
 •   Used by Baidu




Key/Value


 •   Key/Value Store – Oracle Berkeley DB (Oracle
     NoSQL), Redis, Kyoto Cabinet
 •   Can store strings, arrays, hashes




Oracle NoSQL
 •   Sign of things to come!
 •   http://www.oracle.com/technetwork/database/
     nosqldb/overview/index.html
 •   Written in Java
 •   Configurable consistency
 •   Berkeley DB as a backend
 •   No single point of failure
 •   Transactions

Redis

 •   http://redis.io/
 •   Lots of bindings
 •   Written in C
 •   In-memory, with optional durability
 •   Also a data structure store: values can be lists, hashes, sets




Key/Value – eventual consistency
 •   Key/value stores that favor availability over consistency
 •   Inspired by Amazon Dynamo
 •   Dynamo assumes high-speed network links
     between data centers that are close to
     each other
 •   A and P from CAP Theorem
 •   Achieve eventual consistency through
     replication and verification
 •   Consistency is eventual
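
A toy model of "eventual consistency through replication and verification" might look like the following. It is a deliberate simplification: replicas reconcile with a last-write-wins rule on a version number, whereas Dynamo itself uses vector clocks. The names Replica and anti_entropy are hypothetical.

```python
class Replica:
    """One copy of the data; each key maps to a (version, value) pair."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        # Accept the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def anti_entropy(a, b):
    """Compare two replicas and copy the newest version of each key both ways."""
    for key in set(a.data) | set(b.data):
        for src, dst in ((a, b), (b, a)):
            if key in src.data:
                version, value = src.data[key]
                dst.write(key, value, version)
```

Between synchronization rounds two replicas may answer the same read differently; after a round they agree. That window is what "eventual" refers to.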
Cassandra
 •   http://cassandra.apache.org/
 •   Multidimensional map indexed by key
 •   No single point of failure
 •   Decentralized
 •   Tunable consistency
 •   Used by Facebook, Cisco, IBM, Rackspace
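
"Tunable consistency" is usually explained with a quorum rule: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to share at least one replica when R + W > N. The sketch below states the rule and checks it by brute force; the function names are hypothetical and nothing here calls the Cassandra API.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """Quorum rule: reads see the latest acknowledged write when r + w > n."""
    return r + w > n


def always_intersect(n, w, r):
    """Brute-force check: does every write set share a replica with every read set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

For example, with N = 3, writing and reading at a majority (W = 2, R = 2) overlaps, while W = 1, R = 1 is faster but may return stale data. Cassandra exposes this choice per request through consistency levels such as ONE, QUORUM and ALL.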




Voldemort
 •   http://project-voldemort.com/
 •   Developed by LinkedIn
 •   Written in Java
 •   Developer oriented – a lot of modules are
     pluggable
 •   Strictly key/value




Document stores

 •   Document Databases
 •   Document oriented stores are semi-structured
 •   Mostly JSON oriented
 •   Also called schema-free rows
 •   Can query by field
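
A minimal in-memory sketch of what "schema free" and "query by field" mean in practice. The Collection class and its insert/find methods are hypothetical and only mimic the general shape of a document store; they are not the API of MongoDB or CouchDB.

```python
class Collection:
    """Holds JSON-like documents; documents need not share the same fields."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        # Store a copy so later changes to the caller's dict do not leak in.
        self._docs.append(dict(doc))

    def find(self, **criteria):
        """Return documents whose fields equal every given criterion."""
        return [
            doc
            for doc in self._docs
            if all(
                field in doc and doc[field] == value
                for field, value in criteria.items()
            )
        ]


people = Collection()
people.insert({"name": "Ann", "city": "Lviv", "langs": ["java", "python"]})
people.insert({"name": "Bob", "city": "Kyiv"})
people.insert({"name": "Eve", "city": "Lviv", "age": 45})

in_lviv = people.find(city="Lviv")
```

Each document carries its own set of fields, and a query simply skips documents that lack the field being matched.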




MongoDB

 •   http://www.mongodb.org/
 •   Schema-free, document-oriented
 •   Written in C++
 •   Lots of interfaces
 •   JSON documents
 •   Query language, supports indexing
 •   Map/Reduce
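
The Map/Reduce model itself fits in a few lines. The sketch below runs in a single process over a list of JSON-like documents and is meant only to show the map, group and reduce steps; it does not use MongoDB's own map/reduce interface, and the names map_reduce, tag_mapper and sum_reducer are hypothetical.

```python
from collections import defaultdict


def map_reduce(documents, mapper, reducer):
    """Run mapper over every document, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def tag_mapper(doc):
    # Emit (tag, 1) for every tag on the document.
    for tag in doc.get("tags", []):
        yield tag, 1


def sum_reducer(key, values):
    return sum(values)


posts = [
    {"title": "Intro", "tags": ["nosql", "hadoop"]},
    {"title": "CAP", "tags": ["nosql"]},
    {"title": "Misc"},
]

tag_counts = map_reduce(posts, tag_mapper, sum_reducer)
```

Because the mapper looks at one document at a time and the reducer at one key at a time, both steps can be spread across many machines, which is the reason the model scales.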


CouchDB

 •   http://couchdb.apache.org/
 •   RESTful API
 •   JSON documents
 •   Written in Erlang
 •   Supports ACID
 •   Map/Reduce
 •   Eventual consistency

Graph


 •   Provide ways to store graphs
 •   Provide traversing
 •   Graph oriented functionality
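
Storing and traversing a graph can be sketched with an adjacency list and a breadth-first search. This is a generic illustration, not Neo4j's API; the sample graph and the function name shortest_path are invented for the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list of a shortest path or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                if neighbour == goal:
                    # Walk back through the predecessors to rebuild the path.
                    path = []
                    while neighbour is not None:
                        path.append(neighbour)
                        neighbour = previous[neighbour]
                    return path[::-1]
                queue.append(neighbour)
    return None


# Adjacency list: who follows whom in a small social graph.
follows = {
    "ann": ["bob", "eve"],
    "bob": ["dan"],
    "eve": ["dan"],
    "dan": ["zoe"],
}

path = shortest_path(follows, "ann", "zoe")
```

Questions like "how is A connected to B" are a traversal here, while in a relational schema they would need one self-join per hop.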




Neo4j


 •   http://neo4j.org/
 •   Written in Java
 •   Stores and navigates graphs
 •   Stable and proven
 •   Commercial and free licenses




PROS AND CONS OF USAGE


Pros and Cons


 •   Scalability
 •   Transactional Integrity and Consistency
 •   Data Modeling
 •   Query Support
 •   Access and Interface Availability




Typical Usage

 •   Large amount of data
 •   Read/Write balanced?
 •   Read Heavy
 •   Write Heavy
 •   Scan
 •   Geospatial
 •   Map/Reduce
 •   Social data
Is it for you?


  •   Technology is still developing
  •   Be ready to patch
  •   SQL is easier
  •   Not all startups will end up being Facebooks
  •   Some problems can be solved only with
      NoSQL



TRENDS


Trends
 •   Oracle released Oracle NoSQL!
 •   Adoption of Hadoop soars
 •   SQL-like access to NoSQL stores is taking
     shape: UnQL -
     http://www.unqlspec.org/display/UnQL/Home
 •   You can participate!




Opportunities


 •   Spring Data -
     http://www.springsource.org/spring-data
 •   Cloud Foundry PaaS -
     http://www.cloudfoundry.com/
 •   ORM/Simplification




Q/A





Lviv EDGE 2 - NoSQL
