Schemaless Databases

Dan Gunter, Computational Research Division, Lawrence Berkeley National Laboratory

Introduction

Terminology: NOSQL and “Schemaless”

NOSQL past and present
Pre-RDBMS, RDBMS era, NOSQL

Pre-relational structured storage systems
Computer Systems News, 11/28/83

The relational model
[Figure: a relation (table) R, named by a relation variable (table name). The heading is an unordered set of attributes (columns) A1 ... An. Each tuple (row) holds Value1 ... Valuen, and the tuples are also unordered.]

Recent NOSQL database products
Columnar or Extensible record: Google BigTable, HBase, Cassandra, HyperTable
Document Store: SimpleDB, CouchDB, MongoDB, Lotus Domino, Mnesia
Graph DB: Neo4j, FlockDB, InfiniteGraph
Key/Value Store: Memcached, Redis, Tokyo Cabinet, Dynamo, Project Voldemort, Dynomite, Riak

Why NOSQL?

CAP Theorem
[Figure: CAP diagram, annotated “All robust distributed systems live here”.]
Forfeit partition-tolerance: single-site databases, cluster databases, LDAP
Forfeit availability: distributed databases w/ pessimistic locking, majority protocols
Forfeit consistency: Coda, web caching, DNS, Dynamo

CAP, ACID, and BASE
[Figure labels: ACID, BASE]

Pioneers
These implementations are not publicly available, but the distributed-system techniques that they integrated to build huge databases have been imitated, to a greater or lesser extent, by every implementation that followed.

Google BigTable

BigTable’s Data Model
“Google’s Bigtable is essentially a massive, distributed 3-D spreadsheet. It doesn’t do SQL, there is limited support for atomic transactions, nor does it support the full relational database model. In short, in these and other areas, the Google team made design trade-offs to enable the scalability and fault-tolerance Google apps require.”
Robin Harris, StorageMojo (blog), 2006-09-08
[Figure: row “com.cnn.www” with columns contents:, anchor:cnnsi.com, and anchor:my.look.ca. The contents: column holds three versions of “<html>...” at timestamps t3, t5, and t6; the anchor columns hold “CNN” and “CNN.com”.]
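
The “3-D spreadsheet” in the figure can be read as a map from (row, column, timestamp) to a value. The following is a minimal sketch of that idea in plain Python, not Bigtable's actual API: the `read` helper, the integer timestamps for the anchor columns, and the `v3`/`v5`/`v6` suffixes on the page contents are all hypothetical, added only to make the versions distinguishable.

```python
from typing import Dict, Optional

# table[row_key][column_key][timestamp] -> value (the three "dimensions")
Table = Dict[str, Dict[str, Dict[int, str]]]

table: Table = {
    "com.cnn.www": {
        # three versions of the page, at timestamps t3, t5, t6
        "contents:": {3: "<html>...v3", 5: "<html>...v5", 6: "<html>...v6"},
        # anchor timestamps below are illustrative assumptions
        "anchor:cnnsi.com": {9: "CNN"},
        "anchor:my.look.ca": {8: "CNN.com"},
    }
}


def read(table: Table, row: str, column: str, at: Optional[int] = None) -> Optional[str]:
    """Return the newest value at or before timestamp `at` (newest overall if None)."""
    versions = table.get(row, {}).get(column, {})
    eligible = [ts for ts in versions if at is None or ts <= at]
    return versions[max(eligible)] if eligible else None
```

Calling `read(table, "com.cnn.www", "contents:")` returns the newest version of the page, while passing `at=5` returns the version stored at t5.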

Tablets and SSTables

Use of Bloom Filters to optimize lookups
[Figure: Bloom filter bit array for the set { x, y, z }, with a lookup of w. w is not in { x, y, z } because it hashes to one position with a 0.]
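
The lookup rule in the figure (an item is definitely absent if any of its hash positions holds a 0) can be sketched as follows. This is an illustrative toy, not BigTable's implementation: the class name, the bit-array size, the number of hashes, and the use of salted SHA-256 as the hash family are all assumptions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, item: str):
        # Derive num_hashes positions by salting one hash function (an assumption).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # A single 0 among the item's positions proves it was never added.
        return all(self.bits[pos] for pos in self._positions(item))
```

After adding x, y, and z, `might_contain` returns True for each of them. For w it will almost certainly return False, in which case the lookup can skip reading the underlying data; a True answer only means "maybe present" and still requires the real lookup.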

Chubby and Paxos
[Figure: five Chubby servers, each with its own DB; one server is marked Master.]
Each “DB” is a replica. Each server runs on its own host. Google tends to run 5 servers, with only one being the “master” at any one time.

What about CAP?

Amazon’s Dynamo

Dynamo data partitioning and replication
[Figure: hash ring using consistent hashing. Each host “node” is assigned several virtual nodes around the ring. Labels mark the spot an item hashes to, its coordinator node, and the replicas.]
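
The ring in the figure can be approximated with a short sketch: each host contributes several virtual nodes, an item hashes to a spot on the ring, and walking clockwise from that spot yields the coordinator followed by the replicas. The names `HashRing`, `ring_position`, and `preference_list`, the virtual-node naming scheme, and the default counts are hypothetical choices for illustration, not Dynamo's code.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (128-bit MD5, assumed here)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class HashRing:
    def __init__(self, hosts, vnodes_per_host: int = 4):
        # Each host owns several virtual nodes scattered around the ring.
        self._ring = sorted(
            (ring_position(f"{host}#vnode{i}"), host)
            for host in hosts
            for i in range(vnodes_per_host)
        )
        self._positions = [pos for pos, _ in self._ring]
        self._num_hosts = len(set(hosts))

    def preference_list(self, key: str, n_replicas: int = 3):
        """Walk clockwise from the key's spot, collecting distinct hosts.

        The first host returned plays the coordinator role; the rest are replicas.
        """
        wanted = min(n_replicas, self._num_hosts)
        start = bisect.bisect_right(self._positions, ring_position(key))
        owners = []
        for step in range(len(self._ring)):
            host = self._ring[(start + step) % len(self._ring)][1]
            if host not in owners:
                owners.append(host)
                if len(owners) == wanted:
                    break
        return owners
```

Skipping virtual nodes that belong to a host already in the list keeps the replicas on distinct physical hosts, which is the point of separating virtual nodes from hosts.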

Eventual consistency and sloppy quorum

Replica synchronization with Merkle trees
For Dynamo, the “data” are the keys stored in a given virtual node. Each node is a hash of its children. If two top hashes match, then the trees are the same.
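
The comparison described above can be sketched in a few lines: hash each key into a leaf, hash pairs of children upward until one top hash remains, and compare the top hashes of two replicas. This is a simplified illustration under assumptions (SHA-256, sorted and de-duplicated keys as leaves, duplicating the last node on odd levels); the function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(keys) -> bytes:
    """Top hash of a Merkle tree whose leaves are hashes of the sorted keys."""
    level = [_h(k.encode("utf-8")) for k in sorted(set(keys))]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by repeating the last node
        # Each parent is the hash of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def replicas_in_sync(keys_a, keys_b) -> bool:
    """If the two top hashes match, the trees (and key sets) are the same."""
    return merkle_root(keys_a) == merkle_root(keys_b)
```

When the top hashes differ, a fuller implementation would keep the intermediate levels and descend only into subtrees whose hashes disagree, so that two replicas exchange a few hashes rather than every key.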

Infrastructure (at scale) is fractal

The Gold Rush
Columnar or Extensible record: Google BigTable, HBase, Cassandra, HyperTable
Document Store: SimpleDB, CouchDB, MongoDB, Lotus Domino, Mnesia
Graph DB: Neo4j, FlockDB, InfiniteGraph
Key/Value Store: Memcached, Redis, Tokyo Cabinet, Dynamo, Project Voldemort, Dynomite, Riak, Hibari

Key/Value Store: Memcached, Redis, Tokyo Cabinet, Dynamo, Project Voldemort, Dynomite, Riak, Hibari

Project Voldemort
Type: Key/Value Store
License: Apache 2.0
Language: Java
Company: LinkedIn
Web: project-voldemort.com

Riak
Type: Key/Value Store
License: Open-Source
Language: Erlang
Company: Basho
Web: wiki.basho.com/display/RIAK/Riak/
Example: Map/reduce with the Python API
    result = self.client.add(bucket.get_name()).map("Riak.mapValuesJson").reduce("Riak.reduceSum").run()

Hibari
Type: Key/Value Store
License: Open-Source
Language: Erlang
Company: Gemini Mobile
Web: sourceforge.net/projects/hibari/

Columnar or Extensible record: Google BigTable, HBase, Cassandra, HyperTable

Cassandra
Type: Extensible column store
License: Apache 2.0
Language: Java
Company: Apache Software Foundation
Web: cassandra.apache.org

Document Store: SimpleDB, CouchDB, MongoDB, Lotus Domino, Mnesia

CouchDB
Type: Document store
License: Apache 2.0
Language: Erlang
Company: Apache Software Foundation
Web: couchdb.org

MongoDB
Type: Document store
License: AGPL
Language: C++
Company: 10gen
Web: mongodb.org
Source: http://www.slideshare.net/mongodb/mongodb-replica-sets

Mnesia
Type: Document store
License: EPL*
Language: Erlang
Company: Ericsson
Web: www.erlang.org
Papers: http://www.erlang.se/publications/mnesia_overview.pdf
* Mozilla Public License modified to conform with laws of Sweden (more herring)

Why do we care about Mnesia / OTP?
Erlang query for “all females” in company*
    females() ->
        F = fun() ->
                Q = query [E.name || E <- table(employee),
                                     E.sex = female] end,
                mnemosyne:eval(Q)
            end,
        mnesia:transaction(F).
*I know, but it’s not my example. This is right out of the manual.

Comparison of MongoDB and CouchDB
Database          Inserts/sec
MongoDB           16,000
CouchDB           70
CouchDB, batch    1,800

Schemaless data modeling
Source: http://labs.mudynamics.com/2010/04/01/why-nosql-is-bad-for-startups/

Example from distributed monitoring
All of these are data modeling “anti-patterns” for relational DBs.

What’s wrong with EAV?

SQL vs. M/R and other models

Conclusions
