This document provides an overview of CouchDB, including its document-based structure without predefined schemas, RESTful HTTP API, use of JSON and JavaScript for querying and updating data, and focus on availability and flexibility over transactions. It also introduces PHPillow, a PHP library that provides CouchDB integration and abstraction.
Machine Learning Model Validation (Aijun Zhang 2024).pdf
CouchDB, PHPillow & PHP
1. CouchDB, PHP & PHPillow
http://joind.in/1465
Kore Nordmann
<kore@php.net>
@koredn
February 26, 2010
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
2. 2 / 70
About me
Kore Nordmann, <kore@php.net>
Long time PHP developer
Regular speaker, author, etc.
Studies computer science in Dortmund, currently writing
thesis
Active open source developer:
eZ Components (Graph, WebDav, Document), Arbit,
PHPUnit, Torii, PHPillow, KaForkL, Image 3D, WCV, ...
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
3. 3 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
4. 4 / 70
CouchDB is paradigmn shift
Structure
Consistency
API
Applications
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
5. 4 / 70
CouchDB is paradigmn shift
Structure
Consistency
API
Applications
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
6. 4 / 70
CouchDB is paradigmn shift
Structure
Consistency
API
Applications
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
7. 4 / 70
CouchDB is paradigmn shift
Structure
Consistency
API
Applications
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
8. 4 / 70
CouchDB is paradigmn shift
Structure
Consistency
API
Applications
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
9. 5 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
10. 6 / 70
CouchDB
Apache top-level project
Build in Erlang / on Erlang/OTP
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
11. 6 / 70
CouchDB
Apache top-level project
Build in Erlang / on Erlang/OTP
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
12. 6 / 70
CouchDB
Apache top-level project
Build in Erlang / on Erlang/OTP
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
13. 7 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
14. 8 / 70
Structure
Document based database
No pre-defined structure (tables)
Put in any JSON object you want
Even deep structures (arrays of objects)
You may attach any number of files to documents
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
15. 8 / 70
Structure
Document based database
No pre-defined structure (tables)
Put in any JSON object you want
Even deep structures (arrays of objects)
You may attach any number of files to documents
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
16. 8 / 70
Structure
Document based database
No pre-defined structure (tables)
Put in any JSON object you want
Even deep structures (arrays of objects)
You may attach any number of files to documents
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
17. 8 / 70
Structure
Document based database
No pre-defined structure (tables)
Put in any JSON object you want
Even deep structures (arrays of objects)
You may attach any number of files to documents
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
18. 8 / 70
Structure
Document based database
No pre-defined structure (tables)
Put in any JSON object you want
Even deep structures (arrays of objects)
You may attach any number of files to documents
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
19. 9 / 70
Structure
Example wiki document
1 { ” t i t l e ”: ”PHPUK 2 0 1 0 ” ,
2 ” text ”: ” Welcome t o t h e PHPUK . . . ” ,
3 ” creator ”: ” u s e r −b a r ” ,
4 ” edited ”: 2935678239 ,
5 ” r e v i s i o n s ”: [
6 { ” t i t l e ”: ”PHPUK 2 0 0 9 ” ,
7 ” text ”: ” Welcome t o t h e PHPUK . . . ” ,
8 ” c r e a t o r ” : ” u s e r −f o o ” ,
9 ” e d i t e d ” : 2935678183 ,
10 ]
11 },
12 ...
13 ]
14 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
20. 10 / 70
Attachements
Example wiki document
1 { ” t i t l e ”: ”PHPUK 2 0 1 0 ” ,
2 ” creator ”: ” u s e r −b a r ” ,
3 ” text ”: ”<h1>Welcome t o t h e PHPUK</h1>
4
5 <img s r c =” p h p u k 2 0 1 0 / l o g o . png ” a l t =”
PHPUK l o g o ”/>
6 ...” ,
7 ” attachments ”: {
8 ” l o g o . png ” : {
9 ” c o n t e n t t y p e ” : ” image / png ” ,
10 ” stub ”: true ,
11 ” length ”: 42 ,
12 }
13 }
14 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
21. 11 / 70
Migration / Refactoring
Change document structure at any time
No need for non-transaction-safe Data Definition Language
(DDL)
Fits rapid development approaches with common customer
requested changes to the data structure
You need to handle this in your application properly, of course:
Incrementally update structure on modification
Liberal validation on read
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
22. 11 / 70
Migration / Refactoring
Change document structure at any time
No need for non-transaction-safe Data Definition Language
(DDL)
Fits rapid development approaches with common customer
requested changes to the data structure
You need to handle this in your application properly, of course:
Incrementally update structure on modification
Liberal validation on read
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
23. 11 / 70
Migration / Refactoring
Change document structure at any time
No need for non-transaction-safe Data Definition Language
(DDL)
Fits rapid development approaches with common customer
requested changes to the data structure
You need to handle this in your application properly, of course:
Incrementally update structure on modification
Liberal validation on read
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
24. 11 / 70
Migration / Refactoring
Change document structure at any time
No need for non-transaction-safe Data Definition Language
(DDL)
Fits rapid development approaches with common customer
requested changes to the data structure
You need to handle this in your application properly, of course:
Incrementally update structure on modification
Liberal validation on read
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
25. 12 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
26. 13 / 70
HTTP API
RESTful HTTP access
HTTP is available on “all” platforms natively
No PHP extension required
Just use PHPs HTTP stream wrapper, pecl/http or curl
You can use all your known HTTP middleware
Reverse proxies for scaling reads (Varnish, Squid)
Simple custom proxy configuration for direct “AJAX” access
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
27. 13 / 70
HTTP API
RESTful HTTP access
HTTP is available on “all” platforms natively
No PHP extension required
Just use PHPs HTTP stream wrapper, pecl/http or curl
You can use all your known HTTP middleware
Reverse proxies for scaling reads (Varnish, Squid)
Simple custom proxy configuration for direct “AJAX” access
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
28. 13 / 70
HTTP API
RESTful HTTP access
HTTP is available on “all” platforms natively
No PHP extension required
Just use PHPs HTTP stream wrapper, pecl/http or curl
You can use all your known HTTP middleware
Reverse proxies for scaling reads (Varnish, Squid)
Simple custom proxy configuration for direct “AJAX” access
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
29. 14 / 70
HTTP API
GET / POST / PUT / DELETE
<METHOD> http://<host>/<database>/<document>
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
30. 15 / 70
Create a database
1 $ c u r l − i −X PUT h t t p : / / l o c a l h o s t : 5 9 8 4 / p h p u k w i k i
2
3 HTTP/ 1 . 1 201 C r e a t e d
4 S e r v e r : CouchDB / 0 . 1 0 . 0 ( E r l a n g OTP/R13B )
5 Location : http :// l o c a l h o s t :5984/ phpuk wiki
6 Date : F r i , 13 Jan 2009 1 4 : 0 7 : 5 7 GMT
7 Content −Type : t e x t / p l a i n ; c h a r s e t=u t f −8
8 Content −L e n g t h : 12
9 Cache−C o n t r o l : must−r e v a l i d a t e
10
11 {” ok ” : t r u e }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
31. 16 / 70
Create a wiki document
1 $ c u r l − i −X PUT h t t p : / / l o c a l h o s t : 5 9 8 4 / p h p u k w i k i / S t a r t
−−d a t a ’ { ” name ” : ” S t a r t ” , ” t e x t ” : ” H e l l o World ! ” } ’
2
3 HTTP/ 1 . 1 201 C r e a t e d
4 S e r v e r : CouchDB / 0 . 1 0 . 0 ( E r l a n g OTP/R13B )
5 Location : http :// l o c a l h o s t :5984/ phpuk wiki / Start
6 Etag : ”1−6 b f d 4 8 8 5 b 6 c 6 2 b b 5 1 6 9 a 1 9 d 5 a 8 1 9 2 7 e 3 ”
7 Date : F r i , 13 Jan 2009 1 4 : 1 4 : 5 5 GMT
8 Content −Type : t e x t / p l a i n ; c h a r s e t=u t f −8
9 Content −L e n g t h : 68
10 Cache−C o n t r o l : must−r e v a l i d a t e
11
12 {” ok ” : t r u e , ” i d ” : ” S t a r t ” , ” r e v ”:”1 −6
b f d 4 8 8 5 b 6 c 6 2 b b 5 1 6 9 a 1 9 d 5 a 8 1 9 2 7 e 3 ”}
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
32. 17 / 70
Get a wiki document back
1 $ c u r l − i −X GET h t t p : / / l o c a l h o s t : 5 9 8 4 / p h p u k w i k i / S t a r t
2
3 HTTP/ 1 . 1 200 OK
4 S e r v e r : CouchDB / 0 . 1 0 . 0 ( E r l a n g OTP/R13B )
5 Etag : ”1−6 b f d 4 8 8 5 b 6 c 6 2 b b 5 1 6 9 a 1 9 d 5 a 8 1 9 2 7 e 3 ”
6 Date : F r i , 13 Jan 2009 1 4 : 1 5 : 4 8 GMT
7 Content −Type : t e x t / p l a i n ; c h a r s e t=u t f −8
8 Content −L e n g t h : 97
9 Cache−C o n t r o l : must−r e v a l i d a t e
10
11 {” i d ” : ” S t a r t ” , ” r e v ”:”1 −6
b f d 4 8 8 5 b 6 c 6 2 b b 5 1 6 9 a 1 9 d 5 a 8 1 9 2 7 e 3 ” , ” name ” : ” S t a r t ” , ”
t e x t ” : ” H e l l o World ! ” }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
33. 18 / 70
Security
Simple database based access restrictions
Using HTTP plain auth
More fine grained access control will be in next release
Define functions which decide if a request from a user will be
accepted.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
34. 18 / 70
Security
Simple database based access restrictions
Using HTTP plain auth
More fine grained access control will be in next release
Define functions which decide if a request from a user will be
accepted.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
35. 18 / 70
Security
Simple database based access restrictions
Using HTTP plain auth
More fine grained access control will be in next release
Define functions which decide if a request from a user will be
accepted.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
36. 19 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
37. 20 / 70
Data access
How to query such a mess?
Views are small scripts, run for all documents in a database
Views are built iteratively, results stored in BTrees
Once built, they are fast
Mostly JavaScript, but also PHP, Ruby, Perl, Erlang, ...
A view may emit any number of key-value pairs for each
document
Key and value may be any JSON structure
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
38. 20 / 70
Data access
How to query such a mess?
Views are small scripts, run for all documents in a database
Views are built iteratively, results stored in BTrees
Once built, they are fast
Mostly JavaScript, but also PHP, Ruby, Perl, Erlang, ...
A view may emit any number of key-value pairs for each
document
Key and value may be any JSON structure
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
39. 20 / 70
Data access
How to query such a mess?
Views are small scripts, run for all documents in a database
Views are built iteratively, results stored in BTrees
Once built, they are fast
Mostly JavaScript, but also PHP, Ruby, Perl, Erlang, ...
A view may emit any number of key-value pairs for each
document
Key and value may be any JSON structure
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
40. 20 / 70
Data access
How to query such a mess?
Views are small scripts, run for all documents in a database
Views are built iteratively, results stored in BTrees
Once built, they are fast
Mostly JavaScript, but also PHP, Ruby, Perl, Erlang, ...
A view may emit any number of key-value pairs for each
document
Key and value may be any JSON structure
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
41. 20 / 70
Data access
How to query such a mess?
Views are small scripts, run for all documents in a database
Views are built iteratively, results stored in BTrees
Once built, they are fast
Mostly JavaScript, but also PHP, Ruby, Perl, Erlang, ...
A view may emit any number of key-value pairs for each
document
Key and value may be any JSON structure
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
42. 20 / 70
Data access
How to query such a mess?
Views are small scripts, run for all documents in a database
Views are built iteratively, results stored in BTrees
Once built, they are fast
Mostly JavaScript, but also PHP, Ruby, Perl, Erlang, ...
A view may emit any number of key-value pairs for each
document
Key and value may be any JSON structure
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
43. 21 / 70
Wiki
Index all wiki documents by their title
1 f u n c t i o n ( doc )
2 {
3 i f ( doc . t i t l e && doc . t e x t )
4 {
5 e m i t ( doc . t i t l e , doc . i d ) ;
6 }
7 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
44. 22 / 70
Wiki
Index all wiki documents by their title
1 f u n c t i o n ( doc )
2 {
3 i f ( doc . t y p e == ” w i k i ” )
4 {
5 e m i t ( doc . t i t l e , doc . i d ) ;
6 }
7 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
45. 23 / 70
Wiki
Index all documents by their title
1 ” BuildModuleDesign ” => ” w i k i −b u i l d m o d u l e d e s i g n ”
2 ” CodingGuidelines ” => ” w i k i −c o d i n g g u i d e l i n e s ”
3 ” DiscussionProtocols ” => ” w i k i −d i s c u s s i o n p r o t o c o l s ”
4 ” ModuleDesign ” => ” w i k i −m o d u l e d e s i g n ”
5 ” Protocol 08 02 07 ” => ” w i k i −p r o t o c o l 0 8 0 2 0 7 ”
6 ” VCSModuleDesign ” => ” w i k i −v c s m o d u l e d e s i g n ”
7 ...
Custom deteministic IDs can ensure uniqueness of documents
Just set the id property on insert.
CouchDB can also generate IDs for you
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
46. 23 / 70
Wiki
Index all documents by their title
1 ” BuildModuleDesign ” => ” w i k i −b u i l d m o d u l e d e s i g n ”
2 ” CodingGuidelines ” => ” w i k i −c o d i n g g u i d e l i n e s ”
3 ” DiscussionProtocols ” => ” w i k i −d i s c u s s i o n p r o t o c o l s ”
4 ” ModuleDesign ” => ” w i k i −m o d u l e d e s i g n ”
5 ” Protocol 08 02 07 ” => ” w i k i −p r o t o c o l 0 8 0 2 0 7 ”
6 ” VCSModuleDesign ” => ” w i k i −v c s m o d u l e d e s i g n ”
7 ...
Custom deteministic IDs can ensure uniqueness of documents
Just set the id property on insert.
CouchDB can also generate IDs for you
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
47. 23 / 70
Wiki
Index all documents by their title
1 ” BuildModuleDesign ” => ” w i k i −b u i l d m o d u l e d e s i g n ”
2 ” CodingGuidelines ” => ” w i k i −c o d i n g g u i d e l i n e s ”
3 ” DiscussionProtocols ” => ” w i k i −d i s c u s s i o n p r o t o c o l s ”
4 ” ModuleDesign ” => ” w i k i −m o d u l e d e s i g n ”
5 ” Protocol 08 02 07 ” => ” w i k i −p r o t o c o l 0 8 0 2 0 7 ”
6 ” VCSModuleDesign ” => ” w i k i −v c s m o d u l e d e s i g n ”
7 ...
Custom deteministic IDs can ensure uniqueness of documents
Just set the id property on insert.
CouchDB can also generate IDs for you
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
48. 24 / 70
Map-reduce-views
“MapReduce is a software framework introduced by Google to
support distributed computing on large data sets on clusters
of computers.” [Wik09]
Used by CouchDB to implement views
Just a framework / pattern: You can implement “any”
algorithm using map-reduce.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
49. 24 / 70
Map-reduce-views
“MapReduce is a software framework introduced by Google to
support distributed computing on large data sets on clusters
of computers.” [Wik09]
Used by CouchDB to implement views
Just a framework / pattern: You can implement “any”
algorithm using map-reduce.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
50. 24 / 70
Map-reduce-views
“MapReduce is a software framework introduced by Google to
support distributed computing on large data sets on clusters
of computers.” [Wik09]
Used by CouchDB to implement views
Just a framework / pattern: You can implement “any”
algorithm using map-reduce.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
51. 25 / 70
Map-Reduce
Documents
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
55. 29 / 70
Map-Reduce
Map and reduce functions are custom
Reduce is optional, plain view serves as a document index
Reduce may be applied to subsets of the documents
Reduce may be grouped
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
56. 29 / 70
Map-Reduce
Map and reduce functions are custom
Reduce is optional, plain view serves as a document index
Reduce may be applied to subsets of the documents
Reduce may be grouped
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
57. 29 / 70
Map-Reduce
Map and reduce functions are custom
Reduce is optional, plain view serves as a document index
Reduce may be applied to subsets of the documents
Reduce may be grouped
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
58. 29 / 70
Map-Reduce
Map and reduce functions are custom
Reduce is optional, plain view serves as a document index
Reduce may be applied to subsets of the documents
Reduce may be grouped
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
59. 30 / 70
Map-Reduce example
The simplest reduce function is just count()
Often used for statistics
1 f u n c t i o n ( k e y s , v a l u e s , combine )
2 {
3 r e t u r n sum ( v a l u e s ) ;
4 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
60. 31 / 70
Map-Reduce example
The reduce result
1 null => 42
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
61. 32 / 70
More...
What should be covered in more depth?
Map-reduce views in CouchDB
Scalability / consistency in CouchDB
The PHPillow API
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
62. 33 / 70
Map-Reduce example
The map function
1 f u n c t i o n ( doc )
2 {
3 i f ( doc . t y p e == ” w i k i ” )
4 {
5 d a t e = new Date ( ) ;
6 d a t e . s e t T i m e ( doc . e d i t e d ∗ 1000 ) ;
7 emit ( [
8 date . getUTCFullYear ( ) ,
9 d a t e . getUTCMonth ( ) + 1 ,
10 d a t e . getUTCDate ( ) ,
11 d a t e . getUTCHours ( ) ,
12 d a t e . getUTCMinutes ( ) ,
13 d a t e . getUTCSeconds ( ) ,
14 ], 1 );
15 // You c o u l d a l s o e m i t t h e w h o l e doc a s v a l u e
16 }
17 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
64. 35 / 70
Map-Reduce example
The reduce function
1 f u n c t i o n ( k e y s , v a l u e s , combine )
2 {
3 r e t u r n sum ( v a l u e s ) ;
4 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
65. 36 / 70
Map-Reduce example
The reduce result
1 null => 12
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
68. 39 / 70
Map-Reduce example
The grouped reduce result, with group level
group-level=3
1 [2008 , 10 , 11] => 6
2 [2008 , 10 , 12] => 3
3 [2008 , 10 , 14] => 3
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
69. 40 / 70
Full-Text-Search
Index all documents by all their words
1 f u n c t i o n ( doc ) {
2 i f ( doc . t y p e == ” w i k i ” ) {
3 // S i m p l e word i n d e x i n g , d o e s n o t r e s p e c t o v e r a l l
o c c u r e n c e s o f words ,
4 // s t o p w o r d s , d i f f e r e n t word s e p e r a t i o n c h a r a c t e r s ,
o r word v a r i a t i o n s .
5 v a r t e x t = doc . t i t l e . r e p l a c e ( / [ s : . , ! ? − ] + / g , ” ”
) +
6 doc . t e x t . r e p l a c e ( / [ s : . , ! ? − ] + / g , ” ” )
;
7 v a r words = t e x t . s p l i t ( ” ” ) ;
8 f o r ( v a r i = 0 ; i < words . l e n g t h ; ++i ) {
9 value = {};
10 v a l u e [ doc . i d ] = 1 ;
11 e m i t ( words [ i ] . t o L o w e r C a s e ( ) , v a l u e ) ;
12 }
13 }
14 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
70. 41 / 70
Wiki
Index all documents by all their words
1 ...
2 ”a” => { wiki −8: 1}
3 ”a” => { wiki −8: 1}
4 ”a” => { wiki −8: 1}
5 ”a” => { wiki −8: 1}
6 ”a” => { wiki −81: 1}
7 ”a” => { wiki −83: 1}
8 ”a” => { wiki −83: 1}
9 ” able ” => { wiki −39: 1}
10 ” able ” => { wiki −56: 1}
11 ” able ” => { wiki −73: 1}
12 ” able ” => { wiki −80: 1}
13 ” about ” => { wiki −24: 1}
14 ” about ” => { wiki −43: 1}
15 ” about ” => { wiki −85: 1}
16 ...
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
71. 42 / 70
Full-Text-Search
Reduce by word count
1 f u n c t i o n ( keys , v a l u e s ) {
2 var count = {};
3 for ( var i in values ) {
4 for ( var id in values [ i ] ) {
5 i f ( count [ i d ] ) {
6 count [ i d ] = v a l u e s [ i ] [ i d ] + count [ i d ] ;
7 } else {
8 count [ i d ] = v a l u e s [ i ] [ i d ] ;
9 }
10 }
11 }
12 return count ;
13 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
72. 43 / 70
Wiki
Index all documents by all their words
1 ...
2 ”a” => {
3 wiki −68: 6,
4 wiki −66: 6,
5 wiki −22: 4,
6 wiki −63: 3,
7 wiki −60: 2,
8 wiki −35: 2,
9 wiki −34: 1,
10 wiki −31: 1,
11 ...
12 }
13 ” a b l e ” => { w i k i −86: 1 , w i k i −80: 1 , w i k i −73: 1 , w i k i
−56: 1 , w i k i −39: 1}
14 ” a b o u t ” => { w i k i −85: 1 , w i k i −43: 1 , w i k i −24: 1}
15 ...
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
73. 44 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
74. 45 / 70
Local conflicts
Multi-Version Concurrency Control
All documents in the database are versioned
Don’t use it for application level document versioning
Updates and deletes need to specify the revision ID
Changing outdated documents result in conflicts
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
75. 45 / 70
Local conflicts
Multi-Version Concurrency Control
All documents in the database are versioned
Don’t use it for application level document versioning
Updates and deletes need to specify the revision ID
Changing outdated documents result in conflicts
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
76. 45 / 70
Local conflicts
Multi-Version Concurrency Control
All documents in the database are versioned
Don’t use it for application level document versioning
Updates and deletes need to specify the revision ID
Changing outdated documents result in conflicts
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
77. 45 / 70
Local conflicts
Multi-Version Concurrency Control
All documents in the database are versioned
Don’t use it for application level document versioning
Updates and deletes need to specify the revision ID
Changing outdated documents result in conflicts
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
78. 45 / 70
Local conflicts
Multi-Version Concurrency Control
All documents in the database are versioned
Don’t use it for application level document versioning
Updates and deletes need to specify the revision ID
Changing outdated documents result in conflicts
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
79. 46 / 70
Inter document links
There is no ensured inter document consistency in CouchDB
Different possibilities of relating documents:
List IDs of related documents in document (n:m)
... both directions are feasible
Embed the whole related document (1:n)
Solution depends on update-ratio
1 { ” type ”: ” wiki ” ,
2 ” t i t l e ”: ” Hello world ” ,
3 ” text ”: ”...” ,
4 ” comments ” : [
5 { ”comment ” : ” . . . ” } ,
6 ],
7 ” c r e a t o r ” : ” u s e r −f o o ” ,
8 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
80. 46 / 70
Inter document links
There is no ensured inter document consistency in CouchDB
Different possibilities of relating documents:
List IDs of related documents in document (n:m)
... both directions are feasible
Embed the whole related document (1:n)
Solution depends on update-ratio
1 { ” type ”: ” wiki ” ,
2 ” t i t l e ”: ” Hello world ” ,
3 ” text ”: ”...” ,
4 ” comments ” : [
5 { ”comment ” : ” . . . ” } ,
6 ],
7 ” c r e a t o r ” : ” u s e r −f o o ” ,
8 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
81. 46 / 70
Inter document links
There is no ensured inter document consistency in CouchDB
Different possibilities of relating documents:
List IDs of related documents in document (n:m)
... both directions are feasible
Embed the whole related document (1:n)
Solution depends on update-ratio
1 { ” type ”: ” wiki ” ,
2 ” t i t l e ”: ” Hello world ” ,
3 ” text ”: ”...” ,
4 ” comments ” : [
5 { ”comment ” : ” . . . ” } ,
6 ],
7 ” c r e a t o r ” : ” u s e r −f o o ” ,
8 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
82. 46 / 70
Inter document links
There is no ensured inter document consistency in CouchDB
Different possibilities of relating documents:
List IDs of related documents in document (n:m)
... both directions are feasible
Embed the whole related document (1:n)
Solution depends on update-ratio
1 { ” type ”: ” wiki ” ,
2 ” t i t l e ”: ” Hello world ” ,
3 ” text ”: ”...” ,
4 ” comments ” : [
5 { ”comment ” : ” . . . ” } ,
6 ],
7 ” c r e a t o r ” : ” u s e r −f o o ” ,
8 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
83. 46 / 70
Inter document links
There is no ensured inter document consistency in CouchDB
Different possibilities of relating documents:
List IDs of related documents in document (n:m)
... both directions are feasible
Embed the whole related document (1:n)
Solution depends on update-ratio
1 { ” type ”: ” wiki ” ,
2 ” t i t l e ”: ” Hello world ” ,
3 ” text ”: ”...” ,
4 ” comments ” : [
5 { ”comment ” : ” . . . ” } ,
6 ],
7 ” c r e a t o r ” : ” u s e r −f o o ” ,
8 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
84. 46 / 70
Inter document links
There is no ensured inter document consistency in CouchDB
Different possibilities of relating documents:
List IDs of related documents in document (n:m)
... both directions are feasible
Embed the whole related document (1:n)
Solution depends on update-ratio
1 { ” type ”: ” wiki ” ,
2 ” t i t l e ”: ” Hello world ” ,
3 ” text ”: ”...” ,
4 ” comments ” : [
5 { ”comment ” : ” . . . ” } ,
6 ],
7 ” c r e a t o r ” : ” u s e r −f o o ” ,
8 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
85. 47 / 70
Scaling: The CAP theorem
The CAP theorem, read more in “CouchDB: The Definitive
Guide” [JCA09]
CouchDB
Partition
Availability
tolerance eventual
consistency
PAXON RDBMS
consensus enforced
Consistency
protocols consistency
CouchDB employs “Eventual Consistency” [Vog09]
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
86. 48 / 70
Eventual consistency
Europe
Client /
Web-Application
Asia Amerika
Delayed, triggered synchronization (push, pull)
Deterministic (manual) conflict resolution on replication on all
nodes
Scales well for seldom concurrent writes
Structure your documents accordingly
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
87. 48 / 70
Eventual consistency
Europe
Client /
Web-Application
Asia Amerika
Delayed, triggered synchronization (push, pull)
Deterministic (manual) conflict resolution on replication on all
nodes
Scales well for seldom concurrent writes
Structure your documents accordingly
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
88. 48 / 70
Eventual consistency
Europe
Client /
Web-Application
Asia Amerika
Delayed, triggered synchronization (push, pull)
Deterministic (manual) conflict resolution on replication on all
nodes
Scales well for seldom concurrent writes
Structure your documents accordingly
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
89. 48 / 70
Eventual consistency
Europe
Asia Amerika
Delayed, triggered synchronization (push, pull)
Deterministic (manual) conflict resolution on replication on all
nodes
Scales well for seldom concurrent writes
Structure your documents accordingly
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
90. 48 / 70
Eventual consistency
Europe
Client /
Web-Application
Asia Amerika
Delayed, triggered synchronization (push, pull)
Deterministic (manual) conflict resolution on replication on all
nodes
Scales well for seldom concurrent writes
Structure your documents accordingly
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
91. 49 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
92. 50 / 70
PHPillow
Object-oriented client for CouchDB
PHP >= 5.2 since last release (5.3 only before)
>96% test coverage
Still in alpha state
Since CouchDB just got “beta” recently, and no new release
was required.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
93. 50 / 70
PHPillow
Object-oriented client for CouchDB
PHP >= 5.2 since last release (5.3 only before)
>96% test coverage
Still in alpha state
Since CouchDB just got “beta” recently, and no new release
was required.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
94. 50 / 70
PHPillow
Object-oriented client for CouchDB
PHP >= 5.2 since last release (5.3 only before)
>96% test coverage
Still in alpha state
Since CouchDB just got “beta” recently, and no new release
was required.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
95. 51 / 70
PHPillow
Lightweight layer
Features
Simple document validation constraints
Automatic synchronization of views
Automatic versioning of documents
couchdb-python compatible tool for dump and import
Different connection handlers
PHP HTTP stream wrapper
Custom HTTP protocol implementation
Which is faster, most likely because of Connection:
Keep-Alive
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
96. 51 / 70
PHPillow
Lightweight layer
Features
Simple document validation constraints
Automatic synchronization of views
Automatic versioning of documents
couchdb-python compatible tool for dump and import
Different connection handlers
PHP HTTP stream wrapper
Custom HTTP protocol implementation
Which is faster, most likely because of Connection:
Keep-Alive
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
97. 51 / 70
PHPillow
Lightweight layer
Features
Simple document validation constraints
Automatic synchronization of views
Automatic versioning of documents
couchdb-python compatible tool for dump and import
Different connection handlers
PHP HTTP stream wrapper
Custom HTTP protocol implementation
Which is faster, most likely because of Connection:
Keep-Alive
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
98. 51 / 70
PHPillow
Lightweight layer
Features
Simple document validation constraints
Automatic synchronization of views
Automatic versioning of documents
couchdb-python compatible tool for dump and import
Different connection handlers
PHP HTTP stream wrapper
Custom HTTP protocol implementation
Which is faster, most likely because of Connection:
Keep-Alive
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
99. 51 / 70
PHPillow
Lightweight layer
Features
Simple document validation constraints
Automatic synchronization of views
Automatic versioning of documents
couchdb-python compatible tool for dump and import
Different connection handlers
PHP HTTP stream wrapper
Custom HTTP protocol implementation
Which is faster, most likely because of Connection:
Keep-Alive
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
100. 51 / 70
PHPillow
Lightweight layer
Features
Simple document validation constraints
Automatic synchronization of views
Automatic versioning of documents
couchdb-python compatible tool for dump and import
Different connection handlers
PHP HTTP stream wrapper
Custom HTTP protocol implementation
Which is faster, most likely because of Connection:
Keep-Alive
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
101. 51 / 70
PHPillow
Lightweight layer
Features
Simple document validation constraints
Automatic synchronization of views
Automatic versioning of documents
couchdb-python compatible tool for dump and import
Different connection handlers
PHP HTTP stream wrapper
Custom HTTP protocol implementation
Which is faster, most likely because of Connection:
Keep-Alive
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
102. 52 / 70
A document class
A custom document class
1 <?php
2 c l a s s myUserDocument e x t e n d s p h p i l l o w D o c u m e n t {
3 protected $versioned = true ;
4
5 // . . .
6 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
103. 53 / 70
A document class
A custom document class
1 <?php
2 c l a s s myUserDocument e x t e n d s p h p i l l o w D o c u m e n t {
3 protected $versioned = true ;
4
5 protected $ r e q u i r e d P r o p e r t i e s = array (
6 ’ login ’ ,
7 );
8
9 // . . .
10 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
104. 54 / 70
A document class
A custom document class
1 <?php
2 c l a s s myUserDocument e x t e n d s p h p i l l o w D o c u m e n t {
3 // . . .
4 public function construct () {
5 $ t h i s −>p r o p e r t i e s = a r r a y (
6 ’ login ’ => new new
p h p i l l o w R e g e x p V a l i d a t o r ( ’ ( ˆ [ x21 −x 7 e
]+$ ) ’ ) ,
7 ’ name ’ => new p h p i l l o w S t r i n g V a l i d a t o r ( ) ,
8 ’ f r i e n d s ’ => new
phpillowDocumentArrayValidator ( ’
myUserDocument ’ ) ,
9 ...
10 );
11 parent : : construct () ;
12 }
13 // . . .
14 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
105. 55 / 70
A document class
A custom document class
1 <?php
2 c l a s s myUserDocument e x t e n d s p h p i l l o w D o c u m e n t {
3 // . . .
4 protected function generateId () {
5 // R e t u r n n u l l f o r auto−g e n e r a t e d I D s
6 r e t u r n $ t h i s −>s t r i n g T o I d ( $ t h i s −>s t o r a g e −>l o g i n
);
7 }
8 // . . .
9 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
106. 56 / 70
A document class
A custom document class
1 <?php
2 c l a s s myUserDocument e x t e n d s p h p i l l o w D o c u m e n t {
3 // . . .
4 p r o t e c t e d f u n c t i o n getType ( ) {
5 return ’ user ’ ;
6 }
7 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
107. 57 / 70
Document examples
Document creation example
1 // C r e a t e a document
2 $doc = new myUserDocument ( ) ;
3 $doc−>l o g i n = ’ k o r e ’ ;
4 $doc−>name = ’ Kore Nordmann ’ ;
5 $doc−>d a t a = a r r a y (
6 ’ m a i l ’ => ” kore@php . n e t ” ,
7 // . . .
8 );
9
10 try {
11 $ i d = $doc−>s a v e ( ) ;
12 } ( p h p i l l o w R e s p o n s e C o n f l i c t E r r o r E x c e p t i o n $e ) {
13 // Document a l r e a d y e x i s t s
14 }
15
16 // F e t c h a document by ID
17 $doc = new myUserDocument ( $ i d ) ;
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
108. 58 / 70
View examples
View class
1 c l a s s myUserView e x t e n d s p h p i l l o w F i l e V i e w {
2 public function construct () {
3 parent : : construct () ;
4
5 $ t h i s −>v i e w F u n c t i o n s = array (
6 ’ a l l ’ => a r r a y (
7 ’ map ’ => ’ map/ u s e r a l l . j s ’ ,
8 ),
9 ’ u s e r ’ => a r r a y (
10 ’ map ’ => ’ map/ u s e r u s e r . j s ’ ,
11 ’ r e d u c e ’ => ’ r e d u c e /sum . j s ’ ,
12 ),
13 );
14 }
15
16 // . . .
17 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
109. 59 / 70
View examples
View class
1 c l a s s myUserView e x t e n d s p h p i l l o w F i l e V i e w {
2 // . . .
3
4 p r o t e c t e d f u n c t i o n getViewName ( ) {
5 return ’ users ’ ;
6 }
7 }
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
110. 60 / 70
View examples
Query a view
1 $ u s e r s = myUserView : : a l l ( a r r a y (
2 ’ k e y ’ => ’ Kore Nordmann ’ ,
3 ) );
PHPillow validates and converts allowed view parameters
What happens:
View function will be uploaded to the database
CouchDB will index all documents in the database using the
view function
View results will be returned as an array
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
111. 60 / 70
View examples
Query a view
1 $ u s e r s = myUserView : : a l l ( a r r a y (
2 ’ k e y ’ => ’ Kore Nordmann ’ ,
3 ) );
PHPillow validates and converts allowed view parameters
What happens:
View function will be uploaded to the database
CouchDB will index all documents in the database using the
view function
View results will be returned as an array
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
112. 60 / 70
View examples
Query a view
1 $ u s e r s = myUserView : : a l l ( a r r a y (
2 ’ k e y ’ => ’ Kore Nordmann ’ ,
3 ) );
PHPillow validates and converts allowed view parameters
What happens:
View function will be uploaded to the database
CouchDB will index all documents in the database using the
view function
View results will be returned as an array
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
113. 60 / 70
View examples
Query a view
1 $ u s e r s = myUserView : : a l l ( a r r a y (
2 ’ k e y ’ => ’ Kore Nordmann ’ ,
3 ) );
PHPillow validates and converts allowed view parameters
What happens:
View function will be uploaded to the database
CouchDB will index all documents in the database using the
view function
View results will be returned as an array
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
114. 61 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
115. 62 / 70
Attachments
CouchDB allows you to attach files to documents
Files are replicated
You can serve full Web-Applications from a CouchDB
See CouchApp
Deploy using PUSH-replication
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
116. 62 / 70
Attachments
CouchDB allows you to attach files to documents
Files are replicated
You can serve full Web-Applications from a CouchDB
See CouchApp
Deploy using PUSH-replication
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
117. 62 / 70
Attachments
CouchDB allows you to attach files to documents
Files are replicated
You can serve full Web-Applications from a CouchDB
See CouchApp
Deploy using PUSH-replication
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
118. 63 / 70
Eventual consistency
Mirror database into userspace
Offline usage and synchronization of Browser applications
Mozilla develops a JavaScript implementation of the
CouchDB API [Moz09]
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
119. 63 / 70
Eventual consistency
Mirror database into userspace
Offline usage and synchronization of Browser applications
Mozilla develops a JavaScript implementation of the
CouchDB API [Moz09]
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
120. 63 / 70
Eventual consistency
Mirror database into userspace
Offline usage and synchronization of Browser applications
Mozilla develops a JavaScript implementation of the
CouchDB API [Moz09]
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
121. 64 / 70
Ubuntu One
Ubuntu One uses CouchDB
Synchronize contacts & date between nodes, or to a server
Yes, all Ubuntu Karmic users already have a CouchDB running
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
122. 64 / 70
Ubuntu One
Ubuntu One uses CouchDB
Synchronize contacts & date between nodes, or to a server
Yes, all Ubuntu Karmic users already have a CouchDB running
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
123. 65 / 70
Various applications
Arbit uses CouchDB for issue tracking, wiki, FAQ and more
Other applications: http:
//wiki.apache.org/couchdb/CouchDB_in_the_wild
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
124. 66 / 70
Summary
CouchDB is fast (enough)
Document oriented approach allows new application
development approaches
CouchDB scales really well, horizontaly and verticaly
CouchDB fits web applications really well
RDBMS are still better for single-cluster scalable applications
with strong integrity requirements.
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
125. 67 / 70
Outline
Introduction
General
Structure
API
Views
Consistency
PHPillow
Applications
QA
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
127. 69 / 70
The end
Open questions?
Further remarks?
Contact
Mail: <kore@php.net>
Web: http://kore-nordmann.de/ (Slides will be available
here soonish)
Twitter: http://twitter.com/koredn
Comment: http://joind.in/1465
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>
128. 70 / 70
Bibliography I
[JCA09] Noah Slater J. Chris Anderson, Jan Lehnardt, Couchdb: The
definitive guide, O’Reilly Media, Inc., 2009.
[Moz09] Mozilla, Browsercouch documentation, November 2009.
[Vog09] Werner Vogels, Eventually consistent - revisited,
http://www.allthingsdistributed.com/2008/12/eventually_
consistent.html, December 2009.
[Wik09] Wikipedia, Mapreduce — wikipedia, the free encyclopedia, 2009,
[Online; accessed 27-August-2009].
http://kore-nordmann.de/portfolio.html
Kore Nordmann <kore@php.net>