NoSQL and CouchDB
2. Who am I ?
-> My Name: João Cerdeira
-> Team Leader
-> An Agile enthusiast:
Scrum / Kanban / Lean
-> A true believer in
OpenSource
http://twitter.com/jacerdeira cerdeira@gmail.com
3. Disclaimer
-> I understand your questions, but
sometimes I don't have answers
-> I'm not a NoSQL Dogmatic, just an
enthusiast about the new
ways of storing information
-> I have worked with
RDBMS for 12 years
5. I don't care if I/you will
use SQL or NoSQL. I
just want to deliver
better
Services/Applications to
the clients/users.
10. Choose only 2:
C onsistency
A vailability
P artition Tolerance
At a given time in a certain environment
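11. [Diagram: CAP triangle with Consistency, Availability and Partition Tolerance; RDBMS sits between Consistency and Availability, NoSQL between Availability and Partition Tolerance]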
12. Centralized System
In a centralized system (RDBMS) we
don't have network partition
P in CAP
So we get:
A vailability
C onsistency
14. Distributed System
In a distributed system we (might) have
network partition
P in CAP
So you can only pick one:
A vailability
C onsistency
15. CAP in practice
We have only two types of Systems
CP == CA (very similar)
AP
So in a network partition we have only
one choice
C onsistency
A vailability
16. -> B asically A vailable
-> S oft state
-> E ventually consistent
18. How to
Scale Out
RDBMS ?
http://capellaniaprimaria.blogspot.com/2011/02/concurso-deportivo-4-pregunta.html
27. Let's Validate
our Thoughts
Do we need ACID for all solutions?
When is Eventually Consistent enough ?
Different solutions have different needs
29. New Drivers Behind NoSQL
Large amount of data
Commodity hardware
Scale Fast And Cheap
Constantly changing request (data)
32. Why aren't RDBMSs good
enough ?
Scaling reads in an RDBMS is hard
Scaling writes is impossible
35. Think again
Do we really need a RDBMS ?
Sometimes !
But a lot of times we
don't !
37. How did NoSQL start ?
Google: Bigtable
Amazon: Dynamo
Facebook: Cassandra
LinkedIn: Voldemort
Yahoo: HBase (hadoop)
38. Origins
Google : “How can we build a DB on top of Google File
System”
Paper: Bigtable → A distributed storage system for
structured data, 2006
Amazon: “How can we build a distributed hash table for
the data center”
Paper : Dynamo → Amazon's highly available key-value
store
39. Different Types of NoSQL
Key-Value Stores
Document Databases
Column Databases
Graph Databases
40. Key-Value Stores
Origin: Amazon's Dynamo paper
Data model: Collections of KV pairs
Implementations: Dynamo, Voldemort, Membase,
Riak, Redis
Good For:
- Large amount of data
- Scale writes and reads
- Fast
- Programmer friendly
41. Document Databases
Origin: Lotus Notes
Data model: Collections of Documents
Implementations: CouchDB, MongoDB,
Amazon SimpleDB
Good For:
- Human Data Structure
- Programmer friendly
- Rapid Development
- Web friendly
- CRUD
42. Column Databases
Origin: Google's BigTable Paper
Data model: Column family – each row (at least in
theory) can have different configuration
Implementations: BigTable, HBase, Cassandra
Good For:
- Large amount of data
- scale writes like no other
- High availability
43. Graph Databases
Origin: Graph Theory
Data model: Nodes and Relations,
both can have KV pairs
Implementations: Neo4j, FlockDB
Good For:
- resolve graph problems
- Fast
45. Why would I choose CouchDB ?
-> Easy to understand documents
-> Uses standard web technologies
-> Simple to install and configure
-> Small footprint (works on mobile platforms)
-> Scales well (not for huge amounts of data)
-> Replication in the core
46. CouchDB Main Principles
Document Oriented Database
No rows or columns
Collection of JSON Documents
Schema-Free
47. In CouchDB HTTP Rules
-> Everything is an HTTP request
-> We are used to GET and POST
-> But there are others:
-> PUT
-> DELETE
-> COPY
RESTful HTTP API
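To make the "everything is an HTTP request" idea concrete, here is a minimal sketch that maps a few database operations to an HTTP method and URL. The helper name `couchRequest` and its operation names are hypothetical, invented for this illustration; it only describes the request and sends nothing over the network.

```javascript
// Hypothetical helper (not part of CouchDB): describes the HTTP request
// behind a few common operations. Nothing is sent over the network.
const BASE = "http://127.0.0.1:5984"; // local address used throughout this deck

function couchRequest(op, db, docId, options = {}) {
  const docUrl = `${BASE}/${db}/${encodeURIComponent(docId || "")}`;
  switch (op) {
    case "createDb":
      return { method: "PUT", url: `${BASE}/${db}` };
    case "deleteDb":
      return { method: "DELETE", url: `${BASE}/${db}` };
    case "read":
      return { method: "GET", url: docUrl };
    case "save":
      return { method: "PUT", url: docUrl, body: JSON.stringify(options.doc || {}) };
    case "delete":
      // Deleting a document requires the revision being deleted.
      return { method: "DELETE", url: `${docUrl}?rev=${encodeURIComponent(options.rev)}` };
    case "copy":
      // COPY is a non-standard HTTP method that CouchDB accepts.
      return { method: "COPY", url: docUrl, headers: { Destination: options.destination } };
    default:
      throw new Error(`unknown operation: ${op}`);
  }
}

console.log(couchRequest("createDb", "contacts"));
console.log(couchRequest("copy", "contacts", "joaocerdeira", { destination: "batatinha" }));
```

Each returned object could be handed to any HTTP client, which is the point of the slide: no special driver is required.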
48. Why JSON ?
-> Light and text-based data format
-> Simple to parse
-> Not verbose (comparing to xml)
-> Suitable for javascript frameworks (jquery)
-> Parsers available in almost all
programming languages
49. JSON Example
{
  "make": "Ford",
  "model": "Mustang",
  "year": 2009,
  "body": "Coupe",
  "color": "Red",
  "engine": {
    "gas_type": "Petrol",
    "cubic_capacity": 4600
  },
  "previous_owners": [
    {
      "name": "John Smith",
      "mileage": 1000
    },
    {
      "name": "Jane Hunt",
      "mileage": 2500
    }
  ]
}
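As a small illustration of "simple to parse", the sketch below reads a shortened version of the car document with the built-in `JSON.parse` and walks its nested object and array. Note that strict JSON needs double-quoted keys; the variable names are chosen for this example only.

```javascript
// A shortened version of the car document, written as strict JSON text.
const text = `{
  "make": "Ford",
  "model": "Mustang",
  "year": 2009,
  "engine": { "gas_type": "Petrol", "cubic_capacity": 4600 },
  "previous_owners": [
    { "name": "John Smith", "mileage": 1000 },
    { "name": "Jane Hunt", "mileage": 2500 }
  ]
}`;

const car = JSON.parse(text);

// Nested objects and arrays become plain JavaScript values.
const ownerNames = car.previous_owners.map((owner) => owner.name);
const mileageSum = car.previous_owners.reduce((sum, owner) => sum + owner.mileage, 0);

console.log(car.engine.cubic_capacity);
console.log(ownerNames, mileageSum);
```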
54. Create / Delete Database
$ curl http://127.0.0.1:5984
{"couchdb":"Welcome","version":"1.0.1"}
$ curl -X PUT http://127.0.0.1:5984/contacts
{"ok":true}
$ curl -X GET http://127.0.0.1:5984/_all_dbs
["contacts","_users"]
$ curl -X DELETE http://127.0.0.1:5984/contacts
{"ok":true}
55. Manage Documents
$ curl -X PUT http://127.0.0.1:5984/contacts/joaocerdeira -d '{}'
{"ok":true,"id":"joaocerdeira","rev":"1-967a00dff5e02add41819138abb3284d"}
$ curl -X GET http://127.0.0.1:5984/contacts/joaocerdeira
{"_id":"joaocerdeira","_rev":"1-967a00dff5e02add41819138abb3284d"}
$ curl -X DELETE 'http://127.0.0.1:5984/contacts/joaocerdeira?rev=1-967a00dff5e02add41819138abb3284d'
{"ok":true,"id":"joaocerdeira","rev":"2-eec205a9d413992850a6e32678485900"}
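The `rev` values in these responses are how CouchDB detects conflicting changes: an update or delete of an existing document has to name the revision it is based on. The sketch below imitates that check with an in-memory store. `TinyStore` is a hypothetical class written for this illustration, not CouchDB's implementation, and its revision suffix is random, whereas CouchDB derives it from the document content.

```javascript
// In-memory stand-in for one database; NOT how CouchDB is implemented.
class TinyStore {
  constructor() {
    this.docs = new Map(); // _id -> current document, including its _rev
  }

  put(id, doc) {
    const current = this.docs.get(id);
    // An update must carry the revision it was based on.
    if (current && doc._rev !== current._rev) {
      return { error: "conflict", reason: "Document update conflict." };
    }
    // Assumption: revisions look like "<generation>-<suffix>".
    // The random suffix here replaces CouchDB's content-based hash.
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    const rev = `${generation}-${Math.random().toString(16).slice(2, 10)}`;
    this.docs.set(id, { ...doc, _id: id, _rev: rev });
    return { ok: true, id, rev };
  }

  get(id) {
    return this.docs.get(id);
  }
}

const store = new TinyStore();
const first = store.put("batatinha", { firstName: "Joao" });
const stale = store.put("batatinha", { firstName: "Clown" }); // no _rev: rejected
const fresh = store.put("batatinha", { _rev: first.rev, firstName: "Clown" }); // accepted
console.log(first, stale, fresh);
```

The rejected write is the optimistic-locking behaviour to expect whenever two clients edit the same document at once: the second writer has to fetch the latest revision and retry.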
56. Manage Documents
$ curl -X PUT http://127.0.0.1:5984/contacts/joaocerdeira \
    -d '{"firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com"}'
{"ok":true,"id":"joaocerdeira","rev":"1-186fe12b748c40559e8f234d8e566c18"}
$ curl -X GET http://127.0.0.1:5984/contacts/joaocerdeira
{"_id":"joaocerdeira","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com"}
57. Copy Documents
$ curl -X COPY http://127.0.0.1:5984/contacts/joaocerdeira \
    -H "Destination: batatinha"
{"id":"batatinha","rev":"1-186fe12b748c40559e8f234d8e566c18"}
$ curl -X GET http://127.0.0.1:5984/contacts/batatinha
{"_id":"batatinha","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com"}
58. Changing Documents
$ curl -X PUT http://127.0.0.1:5984/contacts/batatinha \
    -d '{"_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"], "phone":"93 1234567"}'
{"ok":true,"id":"batatinha","rev":"2-b7079a6d71179b1571652059355d84c3"}
$ curl -X GET http://127.0.0.1:5984/contacts/batatinha
{"_id":"batatinha","_rev":"2-b7079a6d71179b1571652059355d84c3","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"], "phone":"93 1234567"}
60. Designing Documents
{
  "_id": "joaocerdeira",
  "_rev": "1-186fe12b748c40559e8f234d8e566c18",
  "doctype": "contact",
  "firstName": "Joao",
  "lastName": "Cerdeira",
  "company": "MULTICERT",
  "emails": [
    {
      "type": "personal",
      "email": "cerdeira@gmail.com"
    },
    {
      "type": "business",
      "email": "joao.cerdeira@multicert.com"
    }
  ],
  "phones": [
    {
      "type": "personal",
      "phone": "93 1234567"
    },
    {
      "type": "business",
      "phone": "93 7654321"
    }
  ]
}
66. Querying CouchDB
Queries in JavaScript
Use Map/Reduce for querying
For simple queries Map/Reduce isn't
needed
No joins (but you can get something similar)
67. Simple Views
List All Documents
function(doc){
  emit(doc._id,doc);
}
List All Documents
Of type 'vip'
function(doc){
  if (doc.type=='vip'){
    emit(doc._id,doc);
  }
}
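To see what these map functions produce, here is a minimal sketch that runs them over an in-memory array of documents. `runView` is a hypothetical stand-in for the view engine: in CouchDB `emit` is a global available to map functions, while here it is passed as a second argument, and rows are sorted with a plain string comparison rather than CouchDB's collation rules. The sample documents are invented.

```javascript
// Hypothetical stand-in for the view engine: runs a map function over every
// document and collects the emitted rows, ordered by key.
function runView(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    // Each emitted row remembers which document produced it.
    const emit = (key, value) => rows.push({ id: doc._id, key, value });
    mapFn(doc, emit);
  }
  // Assumption: simple string ordering is enough for this example.
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Invented sample documents.
const docs = [
  { _id: "joaocerdeira", type: "vip", firstName: "Joao" },
  { _id: "batatinha", type: "regular", firstName: "Clown" },
];

// "List All Documents"
const allDocs = runView(docs, (doc, emit) => emit(doc._id, doc));

// "List All Documents Of type 'vip'"
const vipOnly = runView(docs, (doc, emit) => {
  if (doc.type == "vip") {
    emit(doc._id, doc);
  }
});

console.log(allDocs.map((row) => row.key));
console.log(vipOnly.map((row) => row.key));
```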
68. Temp Views
$ curl -X POST -H "Content-type: application/json" \
    http://127.0.0.1:5984/contacts/_temp_view \
    -d '{"map":"function(doc){emit(doc._id,doc);}"}'
{"total_rows":2,"offset":0,"rows":[
{"id":"batatinha","key":"batatinha","value":{"_id":"batatinha","_rev":"2-b7079a6d71179b1571652059355d84c3","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"],"phone":"93 1234567"}},
{"id":"joaocerdeira","key":"joaocerdeira","value":{"_id":"joaocerdeira","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com","_deleted_conflicts":["2-eec205a9d413992850a6e32678485900"]}}
]}
69. Normal Views
{
"_id" : "_design/example",
"views" : {
"foo" : {
"map":"function(doc){emit(doc._id,doc);}"
}
}
}
$ curl -X PUT -H "Content-type: application/json" \
    http://127.0.0.1:5984/contacts/_design/example -d @design_simple1.json
70. Normal Views
$ curl -X GET http://127.0.0.1:5984/contacts/_design/example/_view/foo
{"total_rows":2,"offset":0,"rows":[
{"id":"batatinha","key":"batatinha","value":{"_id":"batatinha","_rev":"2-b7079a6d71179b1571652059355d84c3","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"],"phone":"93 1234567"}},
{"id":"joaocerdeira","key":"joaocerdeira","value":{"_id":"joaocerdeira","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Jo\u00e3o","lastName":"Cerdeira","email":"cerdeira@gmail.com","_deleted_conflicts":["2-eec205a9d413992850a6e32678485900"]}}
]}
72. Map/Reduce Views
{"_id" : "_design/example",
"views" : {
…...................................
"bar" : {
"map":"function(doc){emit(doc,1);}",
"reduce":"function(keys, values, rereduce) {return sum(values);}"
}}}
$ curl -X GET http://127.0.0.1:5984/contacts/_design/example/_view/bar
{"rows":[
{"key":null,"value":7}
]}
73. Map/Reduce Views
{"_id" : "_design/example",
"views" : {
…...................................
"aggreg" : {
"map":"function(doc){if(doc.country){emit(doc.country,1);}}",
"reduce":"function(keys, values, rereduce) {return sum(values);}"
}}}
$ curl -X GET 'http://127.0.0.1:5984/contacts/_design/example/_view/aggreg?group=true'
{"rows":[
{"key":"England","value":1},
{"key":"Portugal","value":2},
{"key":"US","value":2}
]}
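The difference between a plain reduce and `group=true` can be sketched with a small in-memory simulation. `queryView` is a hypothetical helper, not CouchDB's engine: it ignores rereduce, assumes string keys when grouping, and passes plain keys to the reduce function where CouchDB passes key and document id pairs. The sample documents are invented so that the results have the same shape as the outputs above.

```javascript
// CouchDB offers sum() to reduce functions; it is defined here for the simulation.
function sum(values) {
  return values.reduce((a, b) => a + b, 0);
}

// Hypothetical helper: map every document, then reduce all rows together
// (default) or once per distinct key (group: true).
function queryView(docs, mapFn, reduceFn, { group = false } = {}) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ key, value }));
  }
  if (!group) {
    const keys = rows.map((row) => row.key);
    const values = rows.map((row) => row.value);
    return [{ key: null, value: reduceFn(keys, values, false) }];
  }
  const byKey = new Map(); // assumption: keys are strings
  for (const { key, value } of rows) {
    if (!byKey.has(key)) byKey.set(key, []);
    byKey.get(key).push(value);
  }
  return [...byKey.keys()].sort().map((key) => {
    const values = byKey.get(key);
    return { key, value: reduceFn(values.map(() => key), values, false) };
  });
}

// Invented sample data: five contacts with a country, two without.
const docs = [
  { _id: "a", country: "Portugal" },
  { _id: "b", country: "Portugal" },
  { _id: "c", country: "US" },
  { _id: "d", country: "US" },
  { _id: "e", country: "England" },
  { _id: "f" },
  { _id: "g" },
];

const reduceFn = (keys, values, rereduce) => sum(values);

// Like the "bar" view: count every document.
const total = queryView(docs, (doc, emit) => emit(doc, 1), reduceFn);

// Like the "aggreg" view queried with group=true: count per country.
const grouped = queryView(
  docs,
  (doc, emit) => { if (doc.country) emit(doc.country, 1); },
  reduceFn,
  { group: true }
);

console.log(total);
console.log(grouped);
```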
78. One Time Replication
$ curl -H "Content-type: application/json" \
    -X POST http://127.0.0.1:5984/_replicate \
    -d '{"source":"contacts","target":"contacts-replica"}'
{"ok":true,
"session_id":"00872a440fdda973d6a9a18f2f571bb8",
"source_last_seq":19,
"history": [{"session_id":"00872a440fdda973d6a9a18f2f571bb8",
"start_time":"Tue, 05 Jul 2011 23:03:32 GMT",
"end_time":"Tue, 05 Jul 2011 23:03:32 GMT",
"start_last_seq":0,
"end_last_seq":19,
"recorded_seq":19,
"missing_checked":0,
"missing_found":8,
"docs_read":12,
"docs_written":12,
"doc_write_failures":0}]}
80. Continuous Replication
$ curl -vX POST http://127.0.0.1:5984/_replicate \
-d '{
"source":"http://127.0.0.1:5984/contacts",
"target":"http://127.0.0.1:5984/contacts-replica",
"continuous":true
}'
81. [Diagram: nodes labeled Read and Write]
82. Load Balancing
Caching
It's HTTP. So use the tools you know
-> NGINX
-> Squid
-> Apache mod_proxy
-> …....
83. Conflict Resolution
Library
http://thetowersofjacksonville.com/photogallery/photo12411/real.html
88. Conflict Resolution
function(doc) {
  if(doc._conflicts) {
    emit(doc._conflicts, null);
  }
}
{"total_rows":1,"offset":0,"rows":[
{"id":"identifier","key":["2-7c971bb974251ae8541b8fe045964219"],"value":null}
]}
$ curl -X DELETE "$HOST/db-replica/identifier?rev=2-de0ea16f8621cbac506d23a0fbbde08a"
{"ok":true,"id":"identifier","rev":"3-bfe83a296b0445c4d526ef35ef62ac14"}
$ curl -X PUT $HOST/db-replica/identifier \
    -d '{"count":3,"_rev":"2-7c971bb974251ae8541b8fe045964219"}'
{"ok":true,"id":"identifier","rev":"3-5d0319b075a21b095719bc561def7122"}
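Why does `_conflicts` list one revision and not the other? As far as the CouchDB documentation describes it, every node picks the same winner on its own: the revision with the longer history wins and, on a tie, the revision id that sorts higher in ASCII order. The sketch below applies that rule to the two revision ids from this slide. `pickWinner` is a hypothetical function; it uses the generation number as a stand-in for history length and ignores deleted revisions.

```javascript
// Sketch of the deterministic winner rule, under the assumptions stated above.
function pickWinner(revs) {
  const parse = (rev) => {
    const dash = rev.indexOf("-");
    return { generation: parseInt(rev.slice(0, dash), 10), hash: rev.slice(dash + 1) };
  };
  const ordered = [...revs].sort((a, b) => {
    const ra = parse(a);
    const rb = parse(b);
    // Higher generation first.
    if (ra.generation !== rb.generation) return rb.generation - ra.generation;
    // Tie: the hash that sorts higher comes first.
    return ra.hash < rb.hash ? 1 : ra.hash > rb.hash ? -1 : 0;
  });
  return ordered[0];
}

// The two competing revisions from the slide.
const revs = [
  "2-7c971bb974251ae8541b8fe045964219",
  "2-de0ea16f8621cbac506d23a0fbbde08a",
];

const winner = pickWinner(revs);
const conflicts = revs.filter((rev) => rev !== winner);
console.log(winner, conflicts);
```

Under this rule the `2-de0e…` revision wins and `2-7c97…` shows up in `_conflicts`, which matches the view output above. The application then decides which content to keep; here the slide deletes the winning revision and updates the other one.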
90. Clients
JavaScript : jQuery CouchDB Library
.Net : Relax
Java : CouchDB4J
Perl : CouchDB::Client Net::CouchDb
Ruby : CouchRest
Python : couchdb-python
Scala : scouchdb
And so much more ...
91. CouchDB
In
Mobile
http://www.digitaljournal.com/article/261153
95. Own Your Data
I like services like Google but what about
my privacy ?!
I think CouchDB is the way to own my data
96. Partition with Cluster
http://thetowersofjacksonville.com/photogallery/photo12411/real.htm
100. We need a MindSet
Change
Stop seeing all
the data in the
world as
relational data