1. NoSQL workshop
Discover the diversity of NoSQL
Friso van Vollenhoven
fvanvollenhoven@xebia.com
Joris Bontje
jbontje@xebia.com
2. Why not only SQL?
‣ RDBMS not built with scale out in mind
‣ Scaling up has limits
‣ Scaling out: per-machine licenses are expensive
5. MongoDB
‣ Document-oriented database (schema free)
‣ Ad hoc queries, querying nested fields, indexing
‣ Sharding and Replication
$ db.getCollection("twitter-user").find({id: 14352528})
{
    name: "NLJUG, Dutch Java Us",
    statuses_count: 200,
    description: "NLJUG",
    location: "Netherlands",
    screen_name: "nljug",
    status: {
        id: 126973661537255420,
        geo: null,
        source: "HootSuite",
        text: "RT @ktukker: en het lijkt erop dat het kabeltje weer verbonden is. de @nljug server is weer bereikbaar. #xs4all #storing",
        created_at: "Thu Oct 20 10:50:54 +0000 2011"
    },
    url: "http://www.nljug.org/",
    friends_count: 31,
    followers_count: 950,
    id: 14352528,
    created_at: "Thu Apr 10 16:12:07 +0000 2008"
}
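Querying nested fields can be pictured with a small in-memory sketch. This is not MongoDB's implementation or driver API, only an illustration of matching documents on dotted paths such as `status.source`. The helper names `get_path` and `find` and the second document are hypothetical.

```python
from typing import Any, Iterable, Iterator

_MISSING = object()  # marker for "this path does not exist in the document"


def get_path(doc: dict, path: str) -> Any:
    """Follow a dotted path such as 'status.source' into nested dicts."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def find(collection: Iterable[dict], query: dict) -> Iterator[dict]:
    """Yield every document whose fields equal all values in the query."""
    for doc in collection:
        if all(get_path(doc, path) == value for path, value in query.items()):
            yield doc


twitter_users = [
    {"id": 14352528, "screen_name": "nljug", "followers_count": 950,
     "status": {"source": "HootSuite", "geo": None}},
    # Made-up second document, only here to give the query something to skip.
    {"id": 1, "screen_name": "example_user", "followers_count": 12,
     "status": {"source": "web", "geo": None}},
]

by_id = list(find(twitter_users, {"id": 14352528}))
by_source = list(find(twitter_users, {"status.source": "HootSuite"}))
print([user["screen_name"] for user in by_source])
```

Because documents are schema free, a path that is absent in one document simply fails to match instead of raising an error.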
6. Neo4j
‣ Graph database
‣ Storing data as nodes and relationships in graphs
‣ High-speed node traversal / search
‣ Full transaction support
‣ Scalable for reads through replication
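The node and relationship model above can be sketched in a few lines. This is an in-memory illustration of a breadth-first traversal over typed relationships, not Neo4j's API. The `Graph` class, the `FOLLOWS` relationship type, and the user names are assumptions made for the example.

```python
from collections import deque


class Graph:
    """Nodes with properties, connected by typed, directed relationships."""

    def __init__(self):
        self.nodes = {}     # node id -> properties
        self.outgoing = {}  # node id -> list of (relationship type, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.outgoing.setdefault(node_id, [])

    def relate(self, source, rel_type, target):
        self.outgoing[source].append((rel_type, target))

    def traverse(self, start, rel_type, max_depth):
        """Breadth-first traversal; returns {node id: depth} without the start node."""
        depths = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if depths[node] == max_depth:
                continue  # do not expand beyond the requested depth
            for rel, target in self.outgoing[node]:
                if rel == rel_type and target not in depths:
                    depths[target] = depths[node] + 1
                    queue.append(target)
        del depths[start]
        return depths


graph = Graph()
for name in ("alice", "bob", "carol", "nljug"):  # hypothetical users
    graph.add_node(name, screen_name=name)
graph.relate("alice", "FOLLOWS", "bob")
graph.relate("bob", "FOLLOWS", "nljug")
graph.relate("carol", "FOLLOWS", "nljug")

reachable = graph.traverse("alice", "FOLLOWS", max_depth=2)
print(reachable)
```

Each hop only touches the relationships of the current node, so the cost of a traversal depends on the part of the graph visited rather than on the total number of nodes.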
7. HBase
‣ Sparse column database
‣ Based on Google’s BigTable model (similar to Cassandra)
‣ Built for scalability
‣ Strong consistency at the row level
‣ Data model:
‣ row key => column family:column qualifier:column value
‣ columns are sparse
‣ Data is always byte[]
row key            | 'DATA FAMILY'                        | 'OTHER FAMILY'
[100, 101, 13, 8]  | [13,10]:[1,2,3,4,5]  [18,10]:[1,9,7] | [120,124]:[123]
[100, 101, 13, 10] | [13,10]:[1,2,3,4,5]  [20,9]:[1,9,7]  | [120,124]:[123]
[100, 101, 13, 22] | [18,10]:[1,9,7]                      | [120,124]:[123]
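The data model above can be mimicked with nested dictionaries keyed by bytes. This is a minimal in-memory sketch, not the HBase client API. The `SparseTable` class and its method names are hypothetical. The cell values come from the example rows.

```python
class SparseTable:
    """row key -> {(column family, column qualifier): value}, all as bytes."""

    def __init__(self, families):
        self.families = set(families)  # column families are fixed up front
        self.rows = {}

    def put(self, row, family, qualifier, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family!r}")
        # Only cells that are written take up space: columns are sparse.
        self.rows.setdefault(row, {})[(family, qualifier)] = value

    def get(self, row, family, qualifier):
        """Return the stored bytes, or None when the cell was never written."""
        return self.rows.get(row, {}).get((family, qualifier))


DATA, OTHER = b"DATA FAMILY", b"OTHER FAMILY"
table = SparseTable([DATA, OTHER])

row_a = bytes([100, 101, 13, 8])
row_c = bytes([100, 101, 13, 22])

table.put(row_a, DATA, bytes([13, 10]), bytes([1, 2, 3, 4, 5]))
table.put(row_a, DATA, bytes([18, 10]), bytes([1, 9, 7]))
table.put(row_a, OTHER, bytes([120, 124]), bytes([123]))
table.put(row_c, DATA, bytes([18, 10]), bytes([1, 9, 7]))
table.put(row_c, OTHER, bytes([120, 124]), bytes([123]))

present = table.get(row_a, DATA, bytes([13, 10]))
absent = table.get(row_c, DATA, bytes([13, 10]))
print(list(present), absent)
```

Since everything is `byte[]`, the application decides how to encode and decode keys, qualifiers, and values.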
8. Riak
‣ Decentralized key-value store
‣ Links, MapReduce, Secondary indexes
‣ No master node; no single point of failure
‣ Eventually consistent
GET /riak/twitter-user/14352528

Content-Type: application/json
{
    "name": "NLJUG, Dutch Java Us",
    "statuses_count": 200,
    "description": "NLJUG",
    "location": "Netherlands",
    "screen_name": "nljug",
    "status": {
        "id": 126973661537255420,
        "geo": null,
        "source": "HootSuite",
        "text": "RT @ktukker: en het lijkt erop dat het kabeltje weer verbonden is. de @nljug server is weer bereikbaar. #xs4all #storing",
        "created_at": "Thu Oct 20 10:50:54 +0000 2011"
    },
    "url": "http://www.nljug.org/",
    "friends_count": 31,
    "followers_count": 950,
    "id": 14352528,
    "created_at": "Thu Apr 10 16:12:07 +0000 2008"
}
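The request above maps a bucket and a key onto a URL path. The sketch below builds such a URL and decodes the JSON body using only the Python standard library. The base address `http://localhost:8098` and the helper names are assumptions, and a stand-in opener replaces a live node so the example runs offline.

```python
import io
import json
import urllib.request
from urllib.parse import quote


def riak_url(base, bucket, key):
    """Build the /riak/<bucket>/<key> URL used in the request above."""
    return f"{base}/riak/{quote(bucket, safe='')}/{quote(str(key), safe='')}"


def fetch_json(url, opener=urllib.request.urlopen):
    """GET the URL and decode the response body as JSON."""
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


def fake_opener(url):
    # Stand-in for a live node: returns a fixed, shortened body.
    return io.BytesIO(b'{"screen_name": "nljug", "followers_count": 950}')


# Assumed base address; point it at a real node and drop `opener=` to go live.
url = riak_url("http://localhost:8098", "twitter-user", 14352528)
user = fetch_json(url, opener=fake_opener)
print(url)
print(user["screen_name"])
```

With no master node, any node in the cluster can be used as the base address for the same request.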
9. Exercise
‣ Twitter data:
‣ Most @nljug followers and their most recent tweets as JSON data
‣ Now create 4 groups:
‣ MongoDB: Age
‣ Neo4j: Ron
‣ HBase: Friso
‣ Riak: Joris
‣ Access Point: NOSQL password: jfall2011
‣ Data and source available on http://192.168.0.100:8080/
‣ Check README.md for installation instructions, exercises and hints