NoSQL / Spring Data

Polyglot Persistence – An introduction to Spring Data
Pronam Chatterjee
pronamc@vmware.com




                                                        © 2011 VMware Inc. All rights reserved
Presentation goal



    How Spring Data simplifies the
           development of NoSQL
                    applications




Agenda

•   Why NoSQL?
•   Overview of NoSQL databases
•   Introduction to Spring Data
•   Database APIs
      - MongoDB
      - HyperSQL
      - Neo4J




Relational databases are great

• SQL = Rich, declarative query language
• Database enforces referential integrity
• ACID semantics
• Well understood by developers
• Well supported by frameworks and tools, e.g. Spring JDBC, Hibernate, JPA
• Well understood by operations
 • Configuration
 • Care and feeding
 • Backups
 • Tuning
 • Failure and recovery
 • Performance characteristics
• But….




The trouble with relational databases

• Object/relational impedance mismatch
 - Complicated to map rich domain model to relational schema
• Relational schema is rigid
 - Difficult to handle semi-structured data, e.g. varying attributes
 - Schema changes = downtime or $$
• Extremely difficult/impossible to scale writes:
 - Vertical scaling is limited/requires $$
 - Horizontal scaling is limited or requires $$
• Performance can be suboptimal for some use cases




NoSQL databases have emerged…

Each one offers some combination of:
• High performance
• High scalability
• Rich data-model
• Schema-less
In return for:
• Limited transactions
• Relaxed consistency
•…




… but there are few commonalities

• Everyone and their dog has written one
• Different data models
 - Key-value
 - Column
 - Document
 - Graph
• Different APIs – No JDBC, Hibernate, JPA (generally)
• “Same sorry state as the database market in the 1970s before SQL was
    invented” http://queue.acm.org/detail.cfm?id=1961297




NoSQL databases have emerged…

    • NoSQL usage small by
      comparison…
    • But growing…




Agenda
• Why NoSQL?
• Overview of NoSQL databases
• Introduction to Spring Data
• Database APIs
       - MongoDB
       - HyperSQL
       - Neo4J




Redis
• Advanced key-value store
 - Think memcached on steroids (the good kind)
 - Values can be binary strings, Lists, Sets, Sorted Sets, Hash maps, …
 - Operations for each data type, e.g. appending to a list, adding to a
   set, retrieving a slice of a list, …
 - Provides pub/sub-based messaging

• Very fast:
 - In-memory operations
 - ~100K operations/second on entry-level hardware

• Persistent
 - Periodic snapshots of memory OR append commands to log file
 - Limits are size of keys retained in memory.
• Has “transactions”
 - Commands can be batched and executed atomically




Scaling Redis

• Master/slave replication
 - Tree of Redis servers
 - Non-persistent master can replicate to a persistent slave
 - Use slaves for read-only queries
• Sharding
 - Client-side only – consistent hashing based on key
 - Server-side sharding – coming one day
• Run multiple servers per physical host
 - Server is single threaded => Leverage multiple CPUs
 - 32-bit builds are more memory-efficient than 64-bit builds
• Optional "virtual memory"
 - Ideally data should fit in RAM
 - Values (not keys) written to disk
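
The client-side sharding bullet above can be sketched in plain Java. This is a minimal illustration of consistent hashing on the key, assuming a hypothetical ConsistentHashRing class, made-up server names, and MD5 as the hash function; it is not taken from any Redis client library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.TreeMap;

// Hypothetical client-side shard selector; not part of any Redis client library.
final class ConsistentHashRing {
    // Ring positions (hash values) mapped to the server that owns them.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Each server is placed on the ring several times ("virtual nodes")
    // so that keys spread more evenly across servers.
    ConsistentHashRing(List<String> servers, int virtualNodes) {
        for (String server : servers) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    // Returns the server owning the key: the first ring position at or after
    // the key's hash, wrapping around to the start of the ring.
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers configured");
        }
        Long position = ring.ceilingKey(hash(key));
        return ring.get(position != null ? position : ring.firstKey());
    }

    // First 8 bytes of an MD5 digest interpreted as a long (assumed hash choice).
    private static long hash(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this layout, removing a server only remaps the keys that server owned; keys on the remaining servers keep their placement, which is the reason to prefer it over a plain hash-modulo scheme.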




Redis use cases
• Use in conjunction with another database as the SOR
• Drop-in replacement for Memcached
  - Session state
  - Cache of data retrieved from SOR
  - Denormalized datastore for high-performance queries
• Hit counts using INCR command
• Randomly selecting an item – SRANDMEMBER
• Queuing – Lists with LPOP, RPUSH, ….
• High score tables – Sorted sets

Notable users: GitHub, guardian.co.uk, …




vFabric Gemfire - Elastic data fabric
• High performance data grid
• Enhanced parallel disk persistence
• Non-disruptive up/down scalability
• Session state
  - Cache of data retrieved from SOR
  - Denormalized datastore for high-performance queries
• Heterogeneous data sharing
  • Java
  • .NET
  • C++
• Co-located Transactions




Gemfire - Use Cases

 • Ultra-low-latency, high-throughput applications
 • As an L2 cache in Hibernate
 • Distributed batch processing
 • Session state
   - Tomcat
   - tc Server
 • Wide-area replication




Neo4j

 • Graph data model
  - Collection of graph nodes
  - Typed relationships between nodes
  - Nodes and relationships have properties
 • High performance traversal API from roots
  - Breadth first/depth first
 • Query to find root nodes
  - Indexes on node/relationship properties
  - Pluggable - Lucene is the default
 • Graph algorithms: shortest path, …
 • Transactional (ACID) including 2PC
 • Deployment modes
  - Embedded – written in Java
  - Server with REST API
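
To make the data model and the breadth-first traversal from a root concrete, here is a small sketch in plain Java. TinyGraph and its methods are hypothetical names invented for illustration; this is not the Neo4j API, only the same idea (nodes with properties, typed relationships, traversal filtered by relationship type) on in-memory collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory property graph; this is not the Neo4j API.
final class TinyGraph {
    // A typed, directed relationship pointing at a target node id.
    private static final class Rel {
        final String type;
        final String target;

        Rel(String type, String target) {
            this.type = type;
            this.target = target;
        }
    }

    private final Map<String, Map<String, Object>> nodeProperties = new HashMap<>();
    private final Map<String, List<Rel>> outgoing = new HashMap<>();

    void addNode(String id, Map<String, Object> properties) {
        nodeProperties.put(id, new HashMap<>(properties));
    }

    void relate(String from, String type, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Rel(type, to));
    }

    Object property(String id, String name) {
        Map<String, Object> properties = nodeProperties.get(id);
        return properties == null ? null : properties.get(name);
    }

    // Breadth-first traversal from a root node, following only relationships
    // of the given type. Returns node ids in visit order, root first.
    List<String> breadthFirst(String root, String relType) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (Rel rel : outgoing.getOrDefault(current, List.of())) {
                if (rel.type.equals(relType) && visited.add(rel.target)) {
                    queue.add(rel.target);
                }
            }
        }
        return new ArrayList<>(visited);
    }
}
```

Swapping the queue for a stack would give the depth-first variant mentioned on the slide.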


Neo4j Data Model




Neo4j Use Cases

 • Use Cases
  -    Anything social
  -    Cloud/Network management, i.e. tracking/managing physical/virtual resources
  -    Any kind of geospatial data
  -    Master data management
  -    Bioinformatics
  -    Fraud detection
  -    Metadata management
 • Who is using it?
  -    StudiVZ (the largest social network in Europe)
  -    Fanbox
  -    The Swedish military
  -    And big organizations in datacom, intelligence, and finance that wish to remain anonymous




MongoDB

• Document-oriented database
  - JSON-style documents: Lists, Maps, primitives
  - Documents organized into collections (~table)
• Full or partial document updates
  - Transactional update in place on one document
  - Atomic Modifiers
• Rich query language for dynamic queries
• Index support – secondary and compound
• GridFS for efficiently storing large files
• Map/Reduce




Data Model = Binary JSON documents

 {
          "name" : "Ajanta",
          "type" : "Indian",
          "serviceArea" : [
               "94619",
               "94618"
          ],
          "openingHours" : [
               {
                   "dayOfWeek" : "Monday",
                   "open" : 1730,
                   "close" : 2130
               }
          ],
          "_id" : ObjectId("4bddc2f49d1505567c6220a0")
 }

 One document = one DDD aggregate

 • Sequence of bytes on disk = fast I/O
  - No joins/seeks
  - In-place updates when possible => no index updates
 • Transaction = update of single document




MongoDB query by example

 • Find a restaurant that serves the 94619 zip code and is open at 6pm on a Monday

  {
       serviceArea:"94619",
       openingHours: {
           $elemMatch :    {
                  "dayOfWeek" : "Monday",
                  "open": {$lte: 1800},
                  "close": {$gte: 1800}
           }
       }
  }

  // qbeObject is a DBObject holding the query document above
  DBCursor cursor = collection.find(qbeObject);
  while (cursor.hasNext()) {
        DBObject o = cursor.next();
        …
  }
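
What the query above selects can be spelled out in plain Java. The sketch below assumes a hypothetical RestaurantMatcher helper working on documents represented as Map and List values; it does not use the MongoDB driver and only illustrates the role of $elemMatch, namely that a single openingHours entry must satisfy all three conditions at once.

```java
import java.util.List;
import java.util.Map;

// Plain-Java illustration of what the query matches; not the MongoDB driver.
final class RestaurantMatcher {
    // True if the restaurant serves zipCode and has at least one openingHours
    // entry that by itself satisfies all three conditions ($elemMatch).
    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> restaurant, String zipCode,
                           String day, int time) {
        List<String> serviceArea = (List<String>) restaurant.get("serviceArea");
        List<Map<String, Object>> hours =
                (List<Map<String, Object>>) restaurant.get("openingHours");
        if (serviceArea == null || hours == null || !serviceArea.contains(zipCode)) {
            return false;
        }
        for (Map<String, Object> slot : hours) {
            boolean sameDay = day.equals(slot.get("dayOfWeek"));
            boolean openBy = (Integer) slot.get("open") <= time;        // $lte
            boolean closesAfter = (Integer) slot.get("close") >= time;  // $gte
            if (sameDay && openBy && closesAfter) {
                return true;
            }
        }
        return false;
    }
}
```

Without $elemMatch, the three conditions could be satisfied by different array elements, for example a Monday entry that opens too late combined with a Tuesday entry that covers 6pm.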




MongoDB use cases

 • Use cases
     -    Real-time analytics
     -    Content management systems
     -    Single document partial update
     -    Caching
     -    High volume writes
 • Who is using it?
     -    Shutterfly, Foursquare
     -    Bit.ly, Intuit
     -    SourceForge, NY Times
     -    GILT Groupe, Evite
     -    SugarCRM




 Copyright (c) 2011 Chris Richardson. All rights reserved.


Other NoSQL databases

• SimpleDB – “key-value”
• Cassandra – column oriented database
• CouchDB – document-oriented
• Membase – key-value
• Riak – key-value + links
• HBase – column-oriented…




      http://nosql-database.org/ has a list of 122 NoSQL databases



Agenda

 • Why NoSQL?
 • Overview of NoSQL databases
 • Introduction to Spring Data
 • Database APIs
       - MongoDB
       - HyperSQL
       - Neo4J




NoSQL Java APIs

Database                  Libraries
Redis                     Jedis, JRedis, JDBC-Redis, RJC

Neo4j                     Vendor-provided
MongoDB                   Vendor-provided Java driver
Gemfire                   Pure Java map API, Spring-Gemfire templates

But
• Usage patterns
• Tedious configuration
• Repetitive code
• Error prone code
•…




Spring Data Project Goals

 • Bring classic Spring value propositions to a wide range of NoSQL databases:
  - Productivity
  - Programming model consistency: E.g. <NoSQL>Template classes
  - “Portability”




Spring Data sub-projects

 •   Commons: Polyglot persistence
 •   Key-Value: Redis, Riak
 •   Document: MongoDB, CouchDB
 •   Graph: Neo4j
 •   GORM for NoSQL



                                 http://www.springsource.org/spring-data




Many entry points to use

 • Auto-generated repository implementations
 • Opinionated APIs (Think JdbcTemplate)
 • Object Mapping (Java and GORM)
 • Cross Store Persistence Programming model
 • Productivity support in Roo and Grails




Cloud Foundry supports NoSQL




 MongoDB and Redis are provided as services
 => Deploy your MongoDB and Redis applications in seconds




Agenda

• Why NoSQL?
• Overview of NoSQL databases
• Introduction to Spring Data
• Database APIs
      - MongoDB
      - HyperSQL
      - Neo4J




Three databases for today’s talk


        Document database


         Relational database


           Graph database




Three persistence strategies for today’s talk

• Lower level template approach
• Convention-based persistence (Hades)
• Cross-Store persistence using JPA and a NoSQL datastore




Spring Template Patterns

• Resource Management
• Callback methods
• Exception Translation
• Simple Query API
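
The four points above can be sketched together in a few lines of plain Java. All names here (StoreTemplate, StoreConnection, ConnectionFactory, ConnectionCallback, StoreAccessException) are hypothetical stand-ins, not Spring classes; the sketch only mirrors the shape of a JdbcTemplate-style class under those assumptions.

```java
import java.util.List;

// Unchecked exception playing the role of a translated data-access exception.
final class StoreAccessException extends RuntimeException {
    StoreAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical low-level connection whose operations throw checked exceptions.
interface StoreConnection extends AutoCloseable {
    List<String> query(String expression) throws Exception;
}

// Hypothetical factory that opens a new connection per operation.
interface ConnectionFactory {
    StoreConnection open() throws Exception;
}

// Callback method: the only code the application has to supply.
interface ConnectionCallback<T> {
    T doInConnection(StoreConnection connection) throws Exception;
}

final class StoreTemplate {
    private final ConnectionFactory factory;

    StoreTemplate(ConnectionFactory factory) {
        this.factory = factory;
    }

    // Resource management (open/close) and exception translation
    // wrapped around the caller's callback.
    <T> T execute(ConnectionCallback<T> action) {
        try (StoreConnection connection = factory.open()) {
            return action.doInConnection(connection);
        } catch (Exception e) {
            throw new StoreAccessException("Store operation failed", e);
        }
    }

    // Simple query API built on top of execute().
    List<String> find(String expression) {
        return execute(connection -> connection.query(expression));
    }
}
```

Callers never open or close a connection and never see a checked, store-specific exception; they either use find() or pass a callback to execute() for anything the simple API does not cover.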




Repository Implementation




HyperSQL

• Also known as HSQLDB or Hypersonic SQL
• Relational Database
• Table oriented data model
• SQL used for queries
• … you know the rest…




Spring Data Repository Support

• Eliminate boilerplate code – only finder methods
• findByLastName – Specifications for type-safe queries
• JPA CriteriaBuilder integration, QueryDSL
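
The idea of declaring only a finder method and getting an implementation for free can be illustrated with a toy in plain Java. The sketch below uses a JDK dynamic proxy over an in-memory list; Person, PersonRepository and InMemoryRepositoryFactory are hypothetical names, and this is not how Spring Data itself is implemented, only a demonstration that a name such as findByLastName carries enough information to derive a query.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity used for illustration.
final class Person {
    final String firstName;
    final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// The application declares only the finder; no implementation class is written.
interface PersonRepository {
    List<Person> findByLastName(String lastName);
}

// Builds a repository at runtime by parsing method names of the form "findBy<Property>".
final class InMemoryRepositoryFactory {
    @SuppressWarnings("unchecked")
    static <R> R create(Class<R> repositoryType, List<?> entities) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (!name.startsWith("findBy") || name.length() <= 6
                    || args == null || args.length != 1) {
                throw new UnsupportedOperationException(name);
            }
            // "findByLastName" -> field "lastName"
            String property = Character.toLowerCase(name.charAt(6)) + name.substring(7);
            List<Object> result = new ArrayList<>();
            for (Object entity : entities) {
                Field field = entity.getClass().getDeclaredField(property);
                field.setAccessible(true);
                if (args[0].equals(field.get(entity))) {
                    result.add(entity);
                }
            }
            return result;
        };
        return (R) Proxy.newProxyInstance(
                repositoryType.getClassLoader(), new Class<?>[] {repositoryType}, handler);
    }
}
```

A real repository would translate the parsed property into a store-specific query instead of scanning a list, but the programming model seen by the caller is the same: an interface and a naming convention.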




• Type safe queries for multiple backends including JPA, SQL and MongoDB in Java
• Generate Query classes using Java APT
• Code completion in IDE
• Domain types and properties can be referenced safely
• Adapts better to refactoring changes in domain types



http://www.querydsl.com




QueryDSL




 • Repository Support
 • Spring Data JPA
 • Spring Data Mongo
 • Spring Data JDBC extensions
 • QueryDslJdbcTemplate




Spring Data Neo4J

•    Using AspectJ support providing a new programming model
•    Use annotations to define POJO entities
•    Constructor advice automatically handles entity creation
•    Entity field state persisted to graph using aspects
•    Leverage graph database APIs from POJO model
•    Annotation-driven indexing of entities for search




Spring Data Graph Neo4J cross-store

• JPA data and “NOSQL” data can share a data model
• Separate the persistence provider by using annotations
– could be the entire Entity
– or, some of the fields of an Entity
• We call this cross-store persistence
– One transaction manager to coordinate the “NOSQL” store with the JPA relational database
– AspectJ support to manage the “NOSQL” entities and fields
• Holds on to changed values in “change sets” until the transaction commits, for non-transactional data stores
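
The change-set idea can be sketched as a small buffer in plain Java. ChangeSetBuffer is a hypothetical class, and a Map stands in for the non-transactional store; this is not the Spring Data cross-store API, only an illustration of holding changed values until commit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical change-set buffer; not the Spring Data cross-store API.
final class ChangeSetBuffer {
    // Stands in for the non-transactional NoSQL store.
    private final Map<String, Object> store;
    // Changed values held back until the transaction outcome is known.
    private final Map<String, Object> pending = new LinkedHashMap<>();

    ChangeSetBuffer(Map<String, Object> store) {
        this.store = store;
    }

    // Record a changed field value without touching the store yet.
    void set(String field, Object value) {
        pending.put(field, value);
    }

    // Reads see pending changes first, then the stored value.
    Object get(String field) {
        return pending.containsKey(field) ? pending.get(field) : store.get(field);
    }

    // Called when the surrounding transaction commits: flush all changes.
    void commit() {
        store.putAll(pending);
        pending.clear();
    }

    // Called on rollback: discard the changes, leaving the store untouched.
    void rollback() {
        pending.clear();
    }
}
```

Because nothing reaches the store before commit(), a rollback of the relational transaction needs no compensating writes on the NoSQL side.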




A cross-store scenario ...


     You have a traditional web app using JPA to persist data to a relational
     database ...




JPA Data Model





      8/3/11     Slide 46
Cross-Store Data Model




 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 

VMware NoSQL

  • 1. NoSQL / Spring Data Polyglot Persistence – An introduction to Spring Data Pronam Chatterjee pronamc@vmware.com © 2011 VMware Inc. All rights reserved
  • 2. Presentation goal How Spring Data simplifies the development of NoSQL applications 2
  • 3. Agenda • Why NoSQL? • Overview of NoSQL databases • Introduction to Spring Data • Database APIs - MongoDB - HyperSQL - Neo4J 3
  • 4. Relational databases are great • SQL = Rich, declarative query language • Database enforces referential integrity • ACID semantics • Well understood by developers • Well supported by frameworks and tools, e.g. Spring JDBC, Hibernate, JPA • Well understood by operations • Configuration • Care and feeding • Backups • Tuning • Failure and recovery • Performance characteristics • But…. 4
  • 5. The trouble with relational databases • Object/relational impedance mismatch - Complicated to map rich domain model to relational schema • Relational schema is rigid - Difficult to handle semi-structured data, e.g. varying attributes - Schema changes = downtime or $$ • Extremely difficult/impossible to scale writes: - Vertical scaling is limited/requires $$ - Horizontal scaling is limited or requires $$ • Performance can be suboptimal for some use cases 5
  • 6. NoSQL databases have emerged… Each one offers some combination of: • High performance • High scalability • Rich data-model • Schema less In return for: • Limited transactions • Relaxed consistency •… 6
  • 7. … but there are few commonalities • Everyone and their dog has written one • Different data models - Key-value - Column - Document - Graph • Different APIs – No JDBC, Hibernate, JPA (generally) • “Same sorry state as the database market in the 1970s before SQL was invented” http://queue.acm.org/detail.cfm?id=1961297 7
  • 8. NoSQL databases have emerged… • NoSQL usage small by comparison… • But growing… 8
  • 9. Agenda • Why NoSQL? • Overview of NoSQL databases • Introduction to Spring Data • Database APIs - MongoDB - HyperSQL - Neo4J 10
  • 10. Redis • Advanced key-value store - Think memcached on steroids (the good kind) - Values can be binary strings, Lists, Sets, Ordered Sets, Hash maps, .. - Operations for each data type, e.g. appending to a list, adding to a set, retrieving a slice of a list, … - Provides pub/sub-based messaging • Very fast: - In-memory operations - ~100K operations/second on entry-level hardware • Persistent - Periodic snapshots of memory OR append commands to log file - Limits are size of keys retained in memory. • Has “transactions” - Commands can be batched and executed atomically [Figure: key-value table K1 → V1, K2 → V2, K3 → V2] 11
  • 11. Scaling Redis • Master/slave replication - Tree of Redis servers - Non-persistent master can replicate to a persistent slave - Use slaves for read-only queries • Sharding - Client-side only – consistent hashing based on key - Server-side sharding – coming one day • Run multiple servers per physical host - Server is single threaded => Leverage multiple CPUs - 32 bit more efficient than 64 bit • Optional "virtual memory" - Ideally data should fit in RAM - Values (not keys) written to disc 13
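The client-side sharding bullet on the slide above (consistent hashing based on key) can be sketched in plain Java. This is a minimal illustration only: the class name, the server names, the MD5-based hash, and the choice of 100 virtual nodes per server are all assumptions, and it is not the algorithm of any particular Redis client library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative consistent-hash ring for client-side sharding (all names are hypothetical). */
public class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 100; // assumption: ring positions per server
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public ConsistentHashRing(List<String> servers) {
        for (String server : servers) {
            addServer(server);
        }
    }

    /** Places the server at VIRTUAL_NODES positions on the ring. */
    public void addServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    /** Removes all ring positions of the server; keys owned by other servers stay put. */
    public void removeServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.remove(hash(server + "#" + i));
        }
    }

    /** Returns the server at the first ring position at or after the key's hash, wrapping around. */
    public String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    /** Uses the first 8 bytes of the MD5 digest as a 64-bit ring position. */
    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A client would call `serverFor(key)` before every command and send the command to that server. The point of the ring is that removing one server only remaps the keys that server owned.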
  • 12. Redis use cases • Use in conjunction with another database as the SOR • Drop-in replacement for Memcached - Session state - Cache of data retrieved from SOR - Denormalized datastore for high-performance queries • Hit counts using INCR command • Randomly selecting an item – SRANDMEMBER • Queuing – Lists with LPOP, RPUSH, …. • High score tables – Sorted sets Notable users: github, guardian.co.uk, …. 14
  • 13. vFabric Gemfire - Elastic data fabric • High performance data grid • Enhanced parallel disk persistence • Non-disruptive up/down scalability • Session state - Cache of data retrieved from SOR - Denormalized datastore for high-performance queries • Heterogeneous data sharing • Java • .NET • C++ • Co-located Transactions 14
  • 14. Gemfire - Use Cases • Ultra-low-latency, high-throughput applications • As an L2 cache in Hibernate • Distributed batch processing • Session state - Tomcat - tcServer • Wide Area replication 14
  • 15. Neo4j • Graph data model - Collection of graph nodes - Typed relationships between nodes - Nodes and relationships have properties • High performance traversal API from roots - Breadth first/depth first • Query to find root nodes - Indexes on node/relationship properties - Pluggable - Lucene is the default • Graph algorithms: shortest path, … • Transactional (ACID) including 2PC • Deployment modes - Embedded – written in Java - Server with REST API 15
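The graph data model and breadth-first traversal described on the slide above can be illustrated with a small in-memory sketch. It does not use the Neo4j API; `TinyGraph`, its methods, and the relationship type names are hypothetical and exist only to show nodes with properties, typed relationships, and a traversal from a root node.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** In-memory property graph for illustration only; not the Neo4j API. */
public class TinyGraph {

    /** A typed, directed relationship pointing at a target node id. */
    static final class Rel {
        final String type;
        final String target;

        Rel(String type, String target) {
            this.type = type;
            this.target = target;
        }
    }

    private final Map<String, Map<String, Object>> nodeProperties = new HashMap<>();
    private final Map<String, List<Rel>> outgoing = new HashMap<>();

    /** Adds a node with its properties. */
    public void addNode(String id, Map<String, Object> properties) {
        nodeProperties.put(id, properties);
        outgoing.computeIfAbsent(id, k -> new ArrayList<>());
    }

    /** Adds a typed relationship from one node to another. */
    public void relate(String from, String type, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Rel(type, to));
    }

    /** Breadth-first traversal from a root, following only relationships of the given type. */
    public List<String> breadthFirst(String root, String relType) {
        Set<String> visited = new LinkedHashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        visited.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (Rel rel : outgoing.getOrDefault(current, List.of())) {
                // Set.add returns false for nodes already seen, which prevents cycles.
                if (rel.type.equals(relType) && visited.add(rel.target)) {
                    queue.add(rel.target);
                }
            }
        }
        return new ArrayList<>(visited);
    }
}
```

In a real graph database the root would typically be found through an index lookup on a node property, as the slide notes, and the traversal would then run from there.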
  • 17. Neo4j Use Cases • Use Cases - Anything social - Cloud/Network management, i.e. tracking/managing physical/virtual resources - Any kind of geospatial data - Master data management - Bioinformatics - Fraud detection - Metadata management • Who is using it? - StudiVZ (the largest social network in Europe) - Fanbox - The Swedish military - And big organizations in datacom, intelligence, and finance that wish to remain anonymous 19
  • 18. MongoDB • Document-oriented database - JSON-style documents: Lists, Maps, primitives - Documents organized into collections (~table) • Full or partial document updates - Transactional update in place on one document - Atomic Modifiers • Rich query language for dynamic queries • Index support – secondary and compound • GridFS for efficiently storing large files • Map/Reduce 20
  • 19. Data Model = Binary JSON documents • One document = one DDD aggregate • Sequence of bytes on disk = fast I/O - No joins/seeks - In-place updates when possible => no index updates • Transaction = update of single document • Example document: { "name" : "Ajanta", "type" : "Indian", "serviceArea" : [ "94619", "94618" ], "openingHours" : [ { "dayOfWeek" : "Monday", "open" : 1730, "close" : 2130 } ], "_id" : ObjectId("4bddc2f49d1505567c6220a0") } 21
  • 20. MongoDB query by example • Find a restaurant that serves the 94619 zip code and is open at 6pm on a Monday { serviceArea:"94619", openingHours: { $elemMatch : { "dayOfWeek" : "Monday", "open": {$lte: 1800}, "close": {$gte: 1800} } } } DBCursor cursor = collection.find(qbeObject); while (cursor.hasNext()) { DBObject o = cursor.next(); … } 23
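To make the meaning of the query on the slide above concrete, the sketch below evaluates it in memory against the restaurant document from the data-model slide, using nested `Map` and `List` values. It is not the MongoDB Java driver: the class `QueryByExampleSketch` and its methods are hypothetical, and only equality, `$lte`, `$gte`, and `$elemMatch` are handled.

```java
import java.util.List;
import java.util.Map;

/** In-memory illustration of the query semantics; not the MongoDB driver. */
public class QueryByExampleSketch {

    /** The restaurant document from the data-model slide (without the _id field). */
    static Map<String, Object> sampleDocument() {
        Map<String, Object> monday = Map.of("dayOfWeek", "Monday", "open", 1730, "close", 2130);
        return Map.of(
                "name", "Ajanta",
                "type", "Indian",
                "serviceArea", List.of("94619", "94618"),
                "openingHours", List.of(monday));
    }

    /** The query from the slide: serves the zip code and is open at 18:00 on a Monday. */
    static Map<String, Object> openAtSixOnMonday(String zip) {
        Map<String, Object> hours = Map.of(
                "dayOfWeek", "Monday",
                "open", Map.of("$lte", 1800),
                "close", Map.of("$gte", 1800));
        return Map.of("serviceArea", zip, "openingHours", Map.of("$elemMatch", hours));
    }

    /** True if the document satisfies every field condition in the query. */
    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> doc, Map<String, Object> query) {
        for (Map.Entry<String, Object> condition : query.entrySet()) {
            Object actual = doc.get(condition.getKey());
            Object expected = condition.getValue();
            if (expected instanceof Map) {
                if (!matchesOperators(actual, (Map<String, Object>) expected)) {
                    return false;
                }
            } else if (actual instanceof List) {
                // Equality against an array field matches if any element is equal.
                if (!((List<Object>) actual).contains(expected)) {
                    return false;
                }
            } else if (!expected.equals(actual)) {
                return false;
            }
        }
        return true;
    }

    @SuppressWarnings("unchecked")
    private static boolean matchesOperators(Object actual, Map<String, Object> operators) {
        for (Map.Entry<String, Object> op : operators.entrySet()) {
            switch (op.getKey()) {
                case "$lte":
                    if (!(actual instanceof Number) || number(actual) > number(op.getValue())) {
                        return false;
                    }
                    break;
                case "$gte":
                    if (!(actual instanceof Number) || number(actual) < number(op.getValue())) {
                        return false;
                    }
                    break;
                case "$elemMatch":
                    // At least one array element must satisfy all nested conditions.
                    if (!(actual instanceof List)) {
                        return false;
                    }
                    boolean anyElementMatches = false;
                    for (Object element : (List<Object>) actual) {
                        if (element instanceof Map
                                && matches((Map<String, Object>) element, (Map<String, Object>) op.getValue())) {
                            anyElementMatches = true;
                            break;
                        }
                    }
                    if (!anyElementMatches) {
                        return false;
                    }
                    break;
                default:
                    throw new IllegalArgumentException("unsupported operator: " + op.getKey());
            }
        }
        return true;
    }

    private static double number(Object value) {
        return ((Number) value).doubleValue();
    }
}
```

The `$elemMatch` branch is the important part: all three conditions must hold for the same element of `openingHours`, rather than each condition being satisfied by a different element.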
  • 21. MongoDB use cases • Use cases - Real-time analytics - Content management systems - Single document partial update - Caching - High volume writes • Who is using it? - Shutterfly, Foursquare - Bit.ly, Intuit - SourceForge, NY Times - GILT Groupe, Evite - SugarCRM Copyright (c) 2011 Chris Richardson. All rights reserved. 25
  • 22. Other NoSQL databases • SimpleDB – “key-value” • Cassandra – column-oriented database • CouchDB – document-oriented • Membase – key-value • Riak – key-value + links • HBase – column-oriented… http://nosql-database.org/ has a list of 122 NoSQL databases 26
  • 23. Agenda • Why NoSQL? • Overview of NoSQL databases • Introduction to Spring Data • Database APIs - MongoDB - HyperSQL - Neo4J 27
  • 24. NoSQL Java APIs • Database: Libraries - Redis: Jedis, JRedis, JDBC-Redis, RJC - Neo4j: Vendor-provided - MongoDB: Vendor-provided Java driver - Gemfire: Pure Java map API, Spring-Gemfire templates But • Usage patterns • Tedious configuration • Repetitive code • Error prone code •… 28
  • 25. Spring Data Project Goals • Bring classic Spring value propositions to a wide range of NoSQL databases: - Productivity - Programming model consistency: E.g. <NoSQL>Template classes - “Portability” 30
  • 26. Spring Data sub-projects • Commons: Polyglot persistence • Key-Value: Redis, Riak • Document: MongoDB, CouchDB • Graph: Neo4j • GORM for NoSQL http://www.springsource.org/spring-data 31
  • 27. Many entry points to use • Auto-generated repository implementations • Opinionated APIs (Think JdbcTemplate) • Object Mapping (Java and GORM) • Cross Store Persistence Programming model • Productivity support in Roo and Grails 32
  • 28. Cloud Foundry supports NoSQL MongoDB and Redis are provided as services => Deploy your MongoDB and Redis applications in seconds 33
  • 29. Agenda • Why NoSQL? • Overview of NoSQL databases • Introduction to Spring Data • Database APIs - MongoDB - HyperSQL - Neo4J 34
  • 30. Three databases for today’s talk • Document database • Relational database • Graph database 35
  • 31. Three persistence strategies for today’s talk • Lower level template approach • Conventions based persistence (Hades) • Cross-Store persistence using JPA and a NoSQL datastore 36
  • 32. Spring Template Patterns • Resource Management • Callback methods • Exception Translation • Simple Query API 37
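The four template responsibilities listed on the slide above can be sketched without any Spring dependency. In this illustration every type is hypothetical: `FakeConnection` stands in for a store-specific client, `DataAccessFailure` stands in for a common unchecked exception hierarchy, and `FakeTemplate` plays the role of a `<NoSQL>Template` class. It shows the shape of the pattern, not Spring's actual classes.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of the template pattern; all types here are hypothetical. */
public class TemplateSketch {

    /** Stand-in for a low-level client that throws checked, store-specific exceptions. */
    static class FakeConnection implements Closeable {
        private final List<String> log;

        FakeConnection(List<String> log) {
            this.log = log;
            log.add("open");
        }

        String get(String key) throws IOException {
            if (key.isEmpty()) {
                throw new IOException("empty key");
            }
            return "value-of-" + key;
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    /** Unchecked exception that callers see instead of the store-specific one. */
    static class DataAccessFailure extends RuntimeException {
        DataAccessFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Callback method: the only store-specific code the caller has to write. */
    interface ConnectionCallback<T> {
        T doWith(FakeConnection connection) throws IOException;
    }

    static class FakeTemplate {
        final List<String> log = new ArrayList<>();

        /** Resource management and exception translation live here, once. */
        <T> T execute(ConnectionCallback<T> action) {
            try (FakeConnection connection = new FakeConnection(log)) {
                return action.doWith(connection);
            } catch (IOException e) {
                throw new DataAccessFailure("store access failed", e);
            }
        }

        /** Simple query API built on top of the callback. */
        String get(String key) {
            return execute(connection -> connection.get(key));
        }
    }
}
```

Callers that only need simple operations use `get`, while `execute` remains available when lower-level access to the connection is required.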
  • 34. HyperSQL • Also known as HSQLDB or Hypersonic SQL • Relational Database • Table oriented data model • SQL used for queries • … you know the rest… 39
  • 35. Spring Data Repository Support • Eliminate boilerplate code – only finder methods • findByLastName – Specifications for type safe queries • JPA CriteriaBuilder integration • QueryDSL 40
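The idea behind finder methods such as `findByLastName` is that only an interface is written and the implementation is derived from the method name. The sketch below imitates that idea with a JDK dynamic proxy over an in-memory list. It is an assumption-laden illustration, not how Spring Data is implemented: `DerivedFinderSketch`, `Person`, `PersonRepository`, and `repositoryOver` are hypothetical, and only single-property `findBy<Property>` methods are supported.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Derives finder behaviour from method names; a toy imitation, not Spring Data itself. */
public class DerivedFinderSketch {

    public static class Person {
        private final String firstName;
        private final String lastName;

        public Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        public String getFirstName() {
            return firstName;
        }

        public String getLastName() {
            return lastName;
        }
    }

    /** Only this interface is written by hand; there is no implementing class. */
    public interface PersonRepository {
        List<Person> findByLastName(String lastName);

        List<Person> findByFirstName(String firstName);
    }

    /** Creates a repository whose finders filter the given in-memory list. */
    static PersonRepository repositoryOver(List<Person> store) {
        return (PersonRepository) Proxy.newProxyInstance(
                PersonRepository.class.getClassLoader(),
                new Class<?>[] {PersonRepository.class},
                (proxy, method, args) -> {
                    String name = method.getName();
                    if (!name.startsWith("findBy")) {
                        throw new UnsupportedOperationException(name);
                    }
                    // "findByLastName" -> getter "getLastName" on the entity class.
                    Method getter = Person.class.getMethod("get" + name.substring("findBy".length()));
                    List<Person> result = new ArrayList<>();
                    for (Person person : store) {
                        if (args[0].equals(getter.invoke(person))) {
                            result.add(person);
                        }
                    }
                    return result;
                });
    }
}
```

A call such as `repositoryOver(people).findByLastName("Smith")` returns the matching entries, and the caller never supplies any query code.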
  • 36. QueryDSL • Type safe queries for multiple backends including JPA, SQL and MongoDB in Java • Generate Query classes using Java APT • Code completion in IDE • Domain types and properties can be referenced safely • Adapts better to refactoring changes in domain types http://www.querydsl.com 41
  • 37. QueryDSL • Repository Support • Spring Data JPA • Spring Data Mongo • Spring Data JDBC extensions • QueryDslJdbcTemplate 42
  • 38. Spring Data Neo4J • Using AspectJ support providing a new programming model • Use annotations to define POJO entities • Constructor advice automatically handles entity creation • Entity field state persisted to graph using aspects • Leverage graph database APIs from POJO model • Annotation-driven indexing of entities for search 43
  • 39. Spring Data Graph Neo4J cross-store • JPA data and “NOSQL” data can share a data model • Separate the persistence provider by using annotations – could be the entire Entity – or, some of the fields of an Entity • We call this cross-store persistence – One transaction manager to coordinate the “NOSQL” store with the JPA relational database – AspectJ support to manage the “NOSQL” entities and fields • holds on to changed values in “change sets” until the transaction commits for non-transactional data stores 44
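The “change sets” bullet on the slide above describes buffering field changes until the surrounding transaction commits, so that a non-transactional store never sees uncommitted values. A minimal sketch of that buffering idea follows. `ChangeSetSketch`, `FakeStore`, and `ChangeSet` are hypothetical names, and the real cross-store support is driven by AspectJ and a transaction manager rather than by explicit `commit` and `rollback` calls.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Buffers field changes until commit; an illustration of the idea, not Spring's classes. */
public class ChangeSetSketch {

    /** Stand-in for a non-transactional NoSQL store holding one entity's fields. */
    static class FakeStore {
        final Map<String, Object> data = new HashMap<>();
    }

    static class ChangeSet {
        private final Map<String, Object> pending = new LinkedHashMap<>();
        private final FakeStore store;

        ChangeSet(FakeStore store) {
            this.store = store;
        }

        /** Records the new value without touching the store. */
        void set(String field, Object value) {
            pending.put(field, value);
        }

        /** Reads see pending changes first, then fall back to the store. */
        Object get(String field) {
            return pending.containsKey(field) ? pending.get(field) : store.data.get(field);
        }

        /** Called when the surrounding transaction commits: flush everything at once. */
        void commit() {
            store.data.putAll(pending);
            pending.clear();
        }

        /** Called when the surrounding transaction rolls back: the store is never written. */
        void rollback() {
            pending.clear();
        }
    }
}
```

In the cross-store scenario, `commit` and `rollback` would be triggered by the same transaction manager that coordinates the JPA relational database.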
  • 40. A cross-store scenario ... You have a traditional web app using JPA to persist data to a relational database ... 45
  • 41. JPA Data Model 46 8/3/11 Slide 46