Thinking in Documents
   (dropping ACID)


        César D. Rodas
      crodas@member.fsf.org
       http://crodas.org/




     PHP Conference 2009
       São Paulo, Brazil

Who is this fellow?
         Paraguayan
         Part of the Google Summer of Code 2008
         PHP Classes Innovation Award winner 2007, 2008
         ... and a few other things




@crodas - http://crodas.org/ - LaTeX
Agenda
         How to scale
         The Web's major bottleneck
         NoSQL databases
              • Redis
              • Tokyo Cabinet
              • Cassandra
              • CouchDB
              • MongoDB

         Thinking in documents
              • Data behavior
              • Complex operations

         PHP Integration (The fun part!)
         Map/Reduce (Extra time)
Scaling?




Increase computational
                      power



To make it reliable




DISTRIBUTED




How to scale
         Buying more hardware (and connectivity)
         Reverse (threaded) proxies
         DNS round robin for your reverse proxies
         Gearmand
         Memcached
         And... what about the data?




How to scale data?




The hardest way




Scaling RDBMS - Solutions
         Master - Slave replication
         Multi-Master replication
         Data sharding
         DRBD and Heartbeat (RAID-1 over the network)
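Of the options above, data sharding is the one that actually spreads rows across servers. A minimal sketch of hash-based shard selection follows; the function name `shardFor` and the host names are hypothetical, and real deployments often prefer consistent hashing so that adding a server does not remap most keys.

```php
<?php
// Hypothetical helper: map a record key to one of $shardCount shards.
function shardFor(string $key, int $shardCount): int
{
    // crc32() can be negative on 32-bit builds, so mask the sign bit first.
    return (crc32($key) & 0x7fffffff) % $shardCount;
}

// Hypothetical database hosts, one per shard.
$shards = ['db1.example.com', 'db2.example.com', 'db3.example.com'];

// The same key always lands on the same host.
$host = $shards[shardFor('user:42', count($shards))];
```

The application has to run this lookup before every query, which is part of why sharding a relational database means modifying the app.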




Master-Slave replication
         We need to modify our app
         It is only worth it if our application is read-intensive
         It doesn't spread the data across servers
         Single point of failure




Scaling RDBMS - Problems
         SQL
         JOIN
         Autoincrement
         Transactions (ACID)




The easiest way




CAP Theorem
                    Strong Consistency, High Availability, Partition-tolerance



BASE
                    Basically Available, Soft state, Eventually Consistent




Everybody is doing it
         Google
         Amazon
         eBay
         Yahoo!
         Facebook
         ...




Open implementations
         Cassandra
         Redis
         Tokyo Cabinet/Tyrant
         CouchDB
         MongoDB (FTW!)
         ...




Cassandra
         No master (p2p)
         Storage model more like BigTable
         Open source
         Incrementally scalable
         PHP interface (with Thrift)
         I haven't played much with it.




Key-value




Key-value
         Fast
         Similar to PHP's array
         Simple
         Easy to distribute across machines




Memcached
         It is a key-value store engine used as a cache.
         No persistence (RAM only, LRU eviction)
         Lightning fast
         Well supported
         *Everybody* is using it
         Several clients for PHP [I even wrote one ;-)]
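To make the "key-value store with LRU" idea concrete, here is a toy in-memory cache built on a PHP array. It is only a sketch of the eviction policy: the class name `TinyLruCache` is hypothetical, and this is not Memcached's implementation or its client API.

```php
<?php
// Toy key-value cache with least-recently-used eviction (illustration only).
final class TinyLruCache
{
    private $items = [];   // PHP arrays keep insertion order: oldest first
    private $capacity;

    public function __construct(int $capacity)
    {
        $this->capacity = $capacity;
    }

    public function set(string $key, $value): void
    {
        unset($this->items[$key]);      // re-inserting moves the key to the end
        $this->items[$key] = $value;
        if (count($this->items) > $this->capacity) {
            reset($this->items);        // the least recently used entry is first
            unset($this->items[key($this->items)]);
        }
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->items)) {
            return null;                // cache miss
        }
        $value = $this->items[$key];
        unset($this->items[$key]);      // refresh recency on every hit
        $this->items[$key] = $value;
        return $value;
    }
}

$cache = new TinyLruCache(2);
$cache->set('a', 1);
$cache->set('b', 2);
$cache->get('a');       // 'a' becomes the most recently used key
$cache->set('c', 3);    // capacity exceeded: 'b' is evicted
```

Because nothing is written to disk, a restart loses everything, which is why this kind of store is used as a cache rather than as the system of record.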




Redis
         Very new
         As fast as Memcached
         Persistent to disk
         Very simple protocol
         Supports lists and sets
         Replication
         Operations on the key space
         I loved it!
              • Until I realised it is an in-memory DB




Tokyo Tyrant
         Very similar to BerkeleyDB ( dba_open() )
         Performs well (I've been playing a bit with it)
         Actively developed
         HTTP Interface (+/-)
         Memcached Protocol (++)
         Moving towards document-oriented (supports "tables")




Document-oriented DB




http://www.flickr.com/photos/beglen/152027605/


What is a "Document"?

<?php
$collection[$id] = array(
   "title" => "PHP rules",
   "tags" => array("php", "web"),
   "body" => "... PHP rules ...",
   "comments" => array(
       array("author" => "crodas", "comment" => "Yes it does"),
   )
);
?>
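A nested array like the one above maps directly onto JSON, the format the document stores discussed later are built around. The sketch below round-trips the same document through PHP's built-in `json_encode()` and `json_decode()`; the variable names are illustrative only.

```php
<?php
// The "document" from the slide, written as a plain nested array.
$document = [
    'title'    => 'PHP rules',
    'tags'     => ['php', 'web'],
    'body'     => '... PHP rules ...',
    'comments' => [
        ['author' => 'crodas', 'comment' => 'Yes it does'],
    ],
];

// Serialise to JSON and back; no schema has to be declared anywhere.
$json    = json_encode($document);
$decoded = json_decode($json, true);   // true = decode objects as arrays
```

The post, its tags and its comments travel as a single unit, which is what removes the need for JOINs later on.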




Document Databases
         Schema free
         Document versioning
         Improved Key-value store
         Great for storing objects




CouchDB
         Apache project
         Asynchronous replication
         JSON-based (XML free!)
         RESTful interface (might be bad)
         Views are materialized on demand (not Indexes :-( )
         Cool admin
         Safe IO (Append only)
         Distributed (concurrent) by nature (written in Erlang)



MongoDB
         Forget about what its name means in Portuguese.
         Fast, Fast, Fast
         JSON and BSON (Binary JSON-ish)
         Asynchronous replication, autosharding
         Supports indexes (FTW!)
         Nested documents (FTW!)
         Advanced queries (FTW!)
         Native extension for PHP




MongoDB - Advanced
         Select
              • $gt, $lt, $gte, $lte, $eq, $ne: >, <, >=, <=, ==, !=
              • $in, $nin
              • $size, $exists
              • group()
              • limit()
              • skip()
              • ...

         Update
              • $push
              • $pull
              • $inc
              • ...
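To show what the select operators listed above mean, here is a small pure-PHP sketch that evaluates a filter array against an in-memory document. It is an illustration of the semantics only, not how MongoDB or its PHP driver is implemented; `matchesFilter` is a hypothetical name, only a handful of operators are covered, and a missing field is simply treated as a non-match.

```php
<?php
// Illustration only: apply a MongoDB-style filter to a plain PHP array.
function matchesFilter(array $doc, array $filter): bool
{
    foreach ($filter as $field => $condition) {
        if (!array_key_exists($field, $doc)) {
            return false;               // simplification: missing field never matches
        }
        $value = $doc[$field];
        if (!is_array($condition)) {
            // Plain value: equality, or membership when the field holds a list.
            $ok = is_array($value) ? in_array($condition, $value, true)
                                   : $value === $condition;
            if (!$ok) {
                return false;
            }
            continue;
        }
        foreach ($condition as $op => $operand) {
            switch ($op) {
                case '$gt':  $ok = $value >  $operand; break;
                case '$lt':  $ok = $value <  $operand; break;
                case '$gte': $ok = $value >= $operand; break;
                case '$lte': $ok = $value <= $operand; break;
                case '$ne':  $ok = $value !== $operand; break;
                case '$in':  $ok = in_array($value, $operand, true); break;
                case '$nin': $ok = !in_array($value, $operand, true); break;
                default:
                    throw new InvalidArgumentException('Unsupported operator: ' . $op);
            }
            if (!$ok) {
                return false;
            }
        }
    }
    return true;
}

// WHERE field IN (5,6,7) AND enable = 1 AND worth < 5
$filter = ['field' => ['$in' => [5, 6, 7]], 'enable' => 1, 'worth' => ['$lt' => 5]];
$docs = [
    ['field' => 5, 'enable' => 1, 'worth' => 3],
    ['field' => 9, 'enable' => 1, 'worth' => 3],
    ['field' => 6, 'enable' => 1, 'worth' => 8],
];
$hits = array_values(array_filter($docs, function ($d) use ($filter) {
    return matchesFilter($d, $filter);
}));
```

All conditions in a filter are combined with AND, which is why a query is just one array rather than a string of SQL.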




pecl install mongo



MongoDB - Connection

<?php

/* connects to localhost:27017 */
$connection = new Mongo();

/* connect to a remote host (default port) */
$connection = new Mongo( "example.com" );

/* connect to a remote host at a given port */
$connection = new Mongo( "example.com:65432" );

/* select some DB (and create it if it doesn't exist yet) */
$db = $connection->selectDB("db_name");

?>



MongoDB - "Tables"

<?php

$db = $connection->selectDB("db_name");
$table = $db->getCollection("table");

?>




FROM SQL to MongoDB




MongoDB - Count

<?php
/* SELECT count(*) FROM table */
$collection->count();

/* SELECT count(*) FROM table WHERE foo = 1 */
$collection->find(array("foo" => 1))->count();

?>




MongoDB - Queries
<?php
/*
 * SELECT * FROM table WHERE field IN (5,6,7) and enable=1
 * and worth < 5
 * ORDER BY timestamp DESC
 */

$collection->ensureIndex(
   array('field'=>1, 'enable'=>1, 'worth'=>1, 'timestamp'=>-1)
);

$filter = array(
       'field' => array('$in' => array(5,6,7)),
       'enable' => 1,
       'worth' => array('$lt' => 5)
    );
$results = $collection->find($filter)->sort(array('timestamp' => -1));



MongoDB - Pagination
<?php
/*
 * SELECT * FROM table WHERE field IN (5,6,7) and enable=1
 * and worth < 5
 * ORDER BY timestamp DESC LIMIT $offset, 20
 */
$filter = array(
       'field' => array('$in' => array(5,6,7)),
       'enable' => 1,
       'worth' => array('$lt' => 5)
    );

$cursor = $collection->find($filter);
$cursor = $cursor->sort(array('timestamp' => -1))->skip($offset)->limit(20);

foreach ($cursor as $result) {
   var_dump($result);
}
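The `$offset` passed to `skip()` above has to come from somewhere. A minimal sketch, assuming 1-based page numbers and a hypothetical helper named `pageOffset`:

```php
<?php
// Hypothetical helper: translate a 1-based page number into a skip() offset.
function pageOffset(int $page, int $perPage = 20): int
{
    // Clamp to page 1 so a bad query-string value never yields a negative offset.
    return (max(1, $page) - 1) * $perPage;
}

$offset = pageOffset(3);   // third page of 20 results
```

Large offsets tend to get slower because the skipped documents still have to be walked, so for deep pagination a range condition on the sort field is often a better fit.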


Thinking in documents




MongoDB - Data structure
<?php
$post = array(
   "title" => "...",
   "body" => "...",
   "uri" => "...",
   "comments" => array(
          array(
             "email" => "...",
             "name" => "...",
             "comment" => "...",
          ),
   ),
   "tags" => array("tag1", "tag2"),
);
/* Creating indexes (they're important) */
$collection->ensureIndex("uri");
$collection->ensureIndex("comments.email");
$collection->ensureIndex("tags");


MongoDB - Data structure
<?php
/***
 * - SELECT * FROM posts WHERE uri = <uri>
 * - SELECT tags.tag FROM post_has_tags
 *     INNER JOIN tags ON (tags_id = tags.id) WHERE post_id = <post_id>
 * - SELECT * FROM comments WHERE post = <post_id>
 */

$result = $collection->find(array("uri" => "<uri>"));

?>




MongoDB
<?php
/***
 * SELECT posts.* FROM posts INNER
 * JOIN comments ON (comments.post = posts.id)
 * WHERE comments.email = '<email>'
 *
 */

$filter = array(
    "comments.email" => 'crodas@member.fsf.org',
);

$result = $collection->find($filter);

?>
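The `"comments.email"` key above uses dot notation to reach into the embedded comments. As a rough picture of what that lookup means, here is a pure-PHP sketch over an in-memory post; `anyEmbeddedEquals` is a hypothetical helper that handles exactly one dot, and it is not the driver's or the server's implementation.

```php
<?php
// Illustration only: does any embedded item have $subField equal to $expected?
function anyEmbeddedEquals(array $doc, string $path, $expected): bool
{
    $parts = explode('.', $path, 2);
    if (count($parts) !== 2) {
        throw new InvalidArgumentException('Expected a path like "comments.email"');
    }
    list($listField, $subField) = $parts;

    $items = (isset($doc[$listField]) && is_array($doc[$listField])) ? $doc[$listField] : [];
    foreach ($items as $item) {
        if (is_array($item)
            && array_key_exists($subField, $item)
            && $item[$subField] === $expected) {
            return true;    // one matching comment is enough to select the post
        }
    }
    return false;
}

$post = [
    'title'    => 'PHP rules',
    'comments' => [
        ['email' => 'crodas@member.fsf.org', 'comment' => 'Yes it does'],
        ['email' => 'someone@example.com',   'comment' => '+1'],
    ],
];

$found   = anyEmbeddedEquals($post, 'comments.email', 'crodas@member.fsf.org');
$missing = anyEmbeddedEquals($post, 'comments.email', 'nobody@example.com');
```

The whole post is returned when any one of its comments matches, which is the document-store counterpart of the INNER JOIN in the SQL comment.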




MongoDB
<?php
/***
 * SELECT * FROM posts
 * WHERE id IN (SELECT posts_id FROM posts_has_tags
 * INNER JOIN tags ON (tags_id = tags.id) WHERE tag = <tag>)
 *
 */

$filter = array(
    "tags" => '<tag>',
);

$result = $collection->find($filter);

?>




MongoDB
<?php
/***
 * SELECT * FROM posts WHERE id IN (
 * SELECT post FROM comments GROUP
 * BY post HAVING count(*) > 10)
 */

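/* NOTE: $size matches only an exact array length; MongoDB does not accept
 * range operators such as $gt inside it, so this filter will not return the
 * intended posts. Keeping a counter field and querying that is the usual
 * workaround. */
$filter = array(
    "comments" => array('$size' => array('$gt' => 10))
);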

$result = $collection->find($filter);

?>




MongoDB
<?php
/***
 * SELECT * FROM posts WHERE 10 < (
 * SELECT count(*) FROM comments
 * WHERE post = posts.id)
 */
/* when inserting a comment */
$collection->update(
    array("uri" => "uri"), // select
    array('$inc' => array('comments_size'=>1)) //increment
);

$filter = array(
    "comments_size" => array('$gt' => 10)
);

$result = $collection->find($filter);



Map/Reduce
                                         Extra time




Map/Reduce -- Theory
<?php

$result = array();
for ($i = 0; $i < 50; $i++) {
   $result[$i] = pow($i, 2);
}

var_dump($result);

/***
 * IF pow takes 1 second
 * 1 process = 50 seconds
 * 10 processes = 5 seconds
 */

?>
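The arithmetic in the comment above (50 items, 10 workers, 5 items each) can be sketched by splitting the input into independent chunks. The chunks are processed one after another here; handing each one to a separate process or machine is left out, and the variable names are illustrative.

```php
<?php
$inputs  = range(0, 49);
$workers = 10;

// 50 inputs / 10 workers = 5 inputs per chunk; keys are preserved so that
// each partial result can be merged back by its original index.
$chunks = array_chunk($inputs, (int) ceil(count($inputs) / $workers), true);

$result = [];
foreach ($chunks as $chunk) {
    // Each chunk depends on nothing outside itself, so in a real setup
    // this loop body is what would run in parallel.
    $result += array_map(function ($i) {
        return pow($i, 2);
    }, $chunk);
}
```

The speed-up only holds because no chunk needs another chunk's output; that independence is the property the map step relies on.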




Map/Reduce -- Theory II
<?php

$data = range(1, 1000);
$tmp  = array();

/* MAP */
foreach ($data as $key => $value) {
   $n_key = $value % 10;
   /* append */
   $tmp[$n_key][] = $value;
}

/* REDUCE */
foreach ($tmp as $key => $value) {
   $value = array_sum($value);
   print "{$key} = {$value}\n";
}
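The two loops above can be folded into one reusable function that takes the map and reduce steps as callbacks. This is a single-process sketch with a hypothetical `mapReduce` helper; it is not MongoDB's map/reduce API, which expects JavaScript functions.

```php
<?php
// Hypothetical helper: $map returns a [key, value] pair for each input,
// $reduce collapses all values that share a key into one result.
function mapReduce(array $data, callable $map, callable $reduce): array
{
    $groups = [];
    foreach ($data as $item) {
        list($key, $value) = $map($item);
        $groups[$key][] = $value;       // group emitted values by key
    }

    $out = [];
    foreach ($groups as $key => $values) {
        $out[$key] = $reduce($key, $values);
    }
    return $out;
}

// Same job as the slide: sum the numbers 1..1000 grouped by their last digit.
$sums = mapReduce(
    range(1, 1000),
    function ($n) { return [$n % 10, $n]; },
    function ($key, array $values) { return array_sum($values); }
);
```

Since each group is reduced on its own, groups can be handed to different machines and the partial results merged afterwards.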




Questions?




Thank you fellows!




@crodas

                                      crodas.org



Powered by...




 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 

Thinking in documents

  • 1. Thinking in Documents (dropping ACID)
      César D. Rodas, crodas@member.fsf.org, http://crodas.org/
      PHP Conference 2009, São Paulo, Brasil
  • 2. Who is this fellow?
      • Paraguayan
      • Part of the Google Summer of Code 2008
      • PHP Classes Innovation Award winner 2007, 2008
      • ... and a few other things
  • 3. Agenda
      • How to scale
      • The Web's major bottleneck
      • NoSQL databases
          • Redis
          • Tokyo Cabinet
          • Cassandra
          • CouchDB
          • MongoDB
      • Thinking in documents
          • Data behavior
          • Complex operations
      • PHP Integration (The fun part!)
      • Map/Reduce (Extra time)
  • 5. Increase computational power
  • 6. To make it reliable
  • 8. How to scale
      • Buying more hardware (and connectivity)
      • Reverse (threaded) proxies
      • DNS round robin for your reverse proxies
      • Gearmand
      • Memcached
      • And... what about the data?
  • 9. How to scale data?
  • 10. The hardest way
  • 11. Scaling RDBMS - Solutions
      • Master-Slave replication
      • Multi-Master replication
      • Data sharding
      • DRBD and Heartbeat (RAID-1 over the network)
  • 13. Master-Slave replication
      • We need to modify our app
      • It is only worth it if our application is read-intensive
      • It doesn't spread the data across servers
      • Single point of failure
  • 14. Scaling RDBMS - Problems
      • SQL JOIN
      • Autoincrement
      • Transactions (ACID)
  • 15. The easiest way
  • 16. Strong Consistency, High Availability, Partition-tolerance (CAP) Theorem
  • 17. BASE: Basically Available, Soft state, Eventually Consistent
  • 18. Everybody is doing it
      • Google
      • Amazon
      • eBay
      • Yahoo!
      • Facebook
      • ...
  • 19. Open implementations
      • Cassandra
      • Redis
      • Tokyo Cabinet/Tyrant
      • CouchDB
      • MongoDB (FTW!)
      • ...
  • 20. Cassandra
      • No master (p2p)
      • Storage model more like BigTable
      • Open source
      • Incrementally scalable
      • PHP interface (with Thrift)
      • Never played too much with it.
  • 22. Key-value
      • Fast
      • Similar to PHP's array
      • Simple
      • Easy to distribute across machines
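The "easy to distribute across machines" point can be sketched with a plain modulo hash: each key is hashed and the result picks a server. This is a minimal illustration, not something shown in the talk; the server names and the `server_for_key()` helper are assumptions made up for the example.

```php
<?php
// Hypothetical server list; these host names are placeholders.
$servers = array('cache1.example.com', 'cache2.example.com', 'cache3.example.com');

/**
 * Pick the server responsible for a key using a simple modulo hash.
 */
function server_for_key($key, array $servers)
{
    // crc32() may be negative on 32-bit builds; '%u' yields the unsigned value.
    $hash = (float) sprintf('%u', crc32($key));

    return $servers[(int) fmod($hash, count($servers))];
}

foreach (array('user:1', 'user:2', 'post:42') as $key) {
    echo $key, ' => ', server_for_key($key, $servers), "\n";
}
```

The same key always lands on the same server, so any client can find it without a lookup table. The trade-off of plain modulo hashing is that adding or removing a server remaps most keys; consistent hashing is the usual remedy.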
  • 23. Memcached
      • It is a key-value store engine used as a cache.
      • No persistence (RAM, uses LRU)
      • Lightning fast
      • Well supported
      • *Everybody* is using it
      • Several clients for PHP [I even wrote one ;-)]
  • 24. Redis
      • Very new
      • As fast as Memcached
      • Persistent to disk
      • Very simple protocol
      • Supports lists and tuples
      • Replication
      • Operations on the key space
      • I loved it!
          • Until I realised it is an in-memory DB
  • 25. Tokyo Tyrant
      • Very similar to BerkeleyDB ( dba_open() )
      • Performs well (I've been playing a bit with it)
      • Actively developed
      • HTTP Interface (+/-)
      • Memcached Protocol (++)
      • Moving toward document-oriented (supports "tables")
  • 26. Document-oriented DB
  • 28. What is a "Document"?
        <?php
        $collection[$id] = array(
            "title"    => "PHP rules",
            "tags"     => array("php", "web"),
            "body"     => "... PHP rules ...",
            "comments" => array(
                array("author" => "crodas", "comment" => "Yes it does"),
            )
        );
        ?>
  • 29. Document Databases
      • Schema free
      • Document versioning
      • Improved key-value store
      • Great for storing objects
  • 31. CouchDB
      • Apache project
      • Asynchronous replication
      • JSON-based (XML free!)
      • RESTful interface (might be bad)
      • Views are materialized on demand (not indexes :-( )
      • Cool admin
      • Safe IO (append only)
      • Distributed (concurrent) by nature (written in Erlang)
  • 34. MongoDB
      • Forget about what its name means in Portuguese.
      • Fast, Fast, Fast
      • JSON and BSON (Binary JSON-ish)
      • Asynchronous replication, autosharding
      • Supports indexes (FTW!)
      • Nested documents (FTW!)
      • Advanced queries (FTW!)
      • Native extension for PHP
  • 35. MongoDB - Advanced
      • Select
          • $gt, $lt, $gte, $lte, $eq, $ne: >, <, >=, <=, ==, !=
          • $in, $nin
          • $size, $exists
          • group()
          • limit()
          • skip()
          • ...
      • Update
          • $push
          • $pull
          • $inc
          • ...
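To make the update operators listed above concrete, here is a toy, in-memory model of what `$inc`, `$push` and `$pull` do to a document. It is only a sketch of their behaviour on top-level fields, not MongoDB's implementation; `apply_update()` is a hypothetical helper written for this example and needs no server.

```php
<?php
/**
 * Toy model of $inc, $push and $pull on top-level fields.
 * This only mimics the operators' effect; it is not driver or server code.
 */
function apply_update(array $doc, array $update)
{
    foreach ($update as $operator => $fields) {
        foreach ($fields as $field => $arg) {
            switch ($operator) {
                case '$inc':  // add $arg to a numeric field (a missing field counts as 0)
                    $doc[$field] = (isset($doc[$field]) ? $doc[$field] : 0) + $arg;
                    break;
                case '$push': // append $arg to an array field
                    $doc[$field][] = $arg;
                    break;
                case '$pull': // drop every element equal to $arg
                    if (!isset($doc[$field])) {
                        break;
                    }
                    $kept = array();
                    foreach ($doc[$field] as $item) {
                        if ($item !== $arg) {
                            $kept[] = $item;
                        }
                    }
                    $doc[$field] = $kept;
                    break;
            }
        }
    }

    return $doc;
}

$post = array('title' => 'PHP rules', 'tags' => array('php', 'web'), 'views' => 1);
$post = apply_update($post, array('$inc'  => array('views' => 1)));
$post = apply_update($post, array('$push' => array('tags' => 'nosql')));
$post = apply_update($post, array('$pull' => array('tags' => 'web')));
print_r($post); // views is now 2, tags is array('php', 'nosql')
```

The point of these operators is that the change is described to the database instead of reading the document, editing it in PHP and writing it back.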
  • 36. pecl install mongo
  • 37. MongoDB - Connection
        <?php
        /* connects to localhost:27017 */
        $connection = new Mongo();

        /* connect to a remote host (default port) */
        $connection = new Mongo( "example.com" );

        /* connect to a remote host at a given port */
        $connection = new Mongo( "example.com:65432" );

        /* select some DB (and create it if it doesn't exist yet) */
        $db = $connection->selectDB("db_name");
        ?>
  • 38. MongoDB - "Tables"
        <?php
        $db = $connection->selectDB("db_name");
        $table = $db->getCollection("table");
        ?>
  • 39. From SQL to MongoDB
  • 40. MongoDB - Count
        <?php
        /* SELECT count(*) FROM table */
        $collection->count();

        /* SELECT count(*) FROM table WHERE foo = 1 */
        $collection->find(array("foo" => 1))->count();
        ?>
  • 41. MongoDB - Queries
        <?php
        /*
         * SELECT * FROM table WHERE field IN (5,6,7) and enable=1
         * and worth < 5
         * ORDER BY timestamp DESC
         */
        $collection->ensureIndex(
            array('field' => 1, 'enable' => 1, 'worth' => 1, 'timestamp' => -1)
        );

        $filter = array(
            'field'  => array('$in' => array(5,6,7)),
            'enable' => 1,
            'worth'  => array('$lt' => 5)
        );
        $results = $collection->find($filter)->sort(array('timestamp' => -1));
  • 42. MongoDB - Pagination
        <?php
        /*
         * SELECT * FROM table WHERE field IN (5,6,7) and enable=1
         * and worth < 5
         * ORDER BY timestamp DESC LIMIT $offset, 20
         */
        $filter = array(
            'field'  => array('$in' => array(5,6,7)),
            'enable' => 1,
            'worth'  => array('$lt' => 5)
        );
        $cursor = $collection->find($filter);
        $cursor = $cursor->sort(array('timestamp' => -1))->skip($offset)->limit(20);

        foreach ($cursor as $result) {
            var_dump($result);
        }
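The pagination slide leaves `$offset` undefined. A minimal sketch of how it might be derived from a page number follows; `page_offset()` and the in-memory `$rows` array are assumptions for illustration, with `array_slice()` standing in for `skip()` and `limit()` so the example runs without a server.

```php
<?php
/** Translate a 1-based page number into the offset passed to skip(). */
function page_offset($page, $perPage = 20)
{
    $page = max(1, (int) $page); // treat 0 or negative input as the first page

    return ($page - 1) * $perPage;
}

// In-memory stand-in for a result set already sorted newest first.
$rows = range(100, 1);

$offset   = page_offset(3);
$pageRows = array_slice($rows, $offset, 20); // same idea as ->skip($offset)->limit(20)

echo 'offset ', $offset, ': ', $pageRows[0], '..', $pageRows[19], "\n";
```

Note that skipping is not free: the skipped documents still have to be walked past, so very deep pages tend to get slower and a range condition on the sort key may serve better there.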
  • 43. Thinking in documents
  • 45. MongoDB - Data structure
        <?php
        $post = array(
            "title"    => "...",
            "body"     => "...",
            "uri"      => "...",
            "comments" => array(
                array(
                    "email"   => "...",
                    "name"    => "...",
                    "comment" => "...",
                ),
            ),
            "tags"     => array("tag1", "tag2"),
        );

        /* Creating indexes (they're important) */
        $collection->ensureIndex("uri");
        $collection->ensureIndex("comments.email");
        $collection->ensureIndex("tags");
  • 46. MongoDB - Data structure
        <?php
        /***
         * - SELECT * FROM posts WHERE uri = <uri>
         * - SELECT tags.tag FROM post_has_tags
         *   INNER JOIN tags ON (tags_id = tags.id) WHERE post_id = <post_id>
         * - SELECT * FROM comments WHERE post = <post_id>
         */
        $result = $collection->find(array("uri" => "<uri>"));
        ?>
  • 47. MongoDB
        <?php
        /***
         * SELECT posts.* FROM posts INNER
         * JOIN comments ON (comments.post = posts.id)
         * WHERE comments.email = '<email>'
         */
        $filter = array(
            "comments.email" => 'crodas@member.fsf.org',
        );
        $result = $collection->find($filter);
        ?>
  • 48. MongoDB
        <?php
        /***
         * SELECT * FROM posts
         * WHERE id IN (SELECT posts_id FROM posts_has_tags
         *   INNER JOIN tags ON (tags_id = tags.id) WHERE tag = <tag>)
         */
        $filter = array(
            "tags" => '<tag>',
        );
        $result = $collection->find($filter);
        ?>
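The two filters above replace SQL joins because the related data lives inside the post itself. As a rough mental model (not driver code), a filter on `tags` behaves like an array membership test and a filter on `comments.email` like a scan over the embedded comments. The helpers `has_tag()` and `has_comment_from()` and the sample URIs are hypothetical, invented for this sketch.

```php
<?php
/** True when the post's "tags" array contains $tag. */
function has_tag(array $post, $tag)
{
    return in_array($tag, $post['tags'], true);
}

/** True when any embedded comment carries the given e-mail address. */
function has_comment_from(array $post, $email)
{
    foreach ($post['comments'] as $comment) {
        if (isset($comment['email']) && $comment['email'] === $email) {
            return true;
        }
    }

    return false;
}

// Sample documents; the URIs are placeholders.
$posts = array(
    array(
        'uri'      => '/php-rules',
        'tags'     => array('php', 'web'),
        'comments' => array(
            array('email' => 'crodas@member.fsf.org', 'comment' => 'Yes it does'),
        ),
    ),
    array(
        'uri'      => '/hello-nosql',
        'tags'     => array('nosql'),
        'comments' => array(),
    ),
);

$tagged = array_values(array_filter($posts, function ($post) {
    return has_tag($post, 'php');
}));
$commented = array_values(array_filter($posts, function ($post) {
    return has_comment_from($post, 'crodas@member.fsf.org');
}));

echo count($tagged), ' tagged, ', count($commented), " commented\n";
```

With the indexes created earlier on `tags` and `comments.email`, the database should be able to answer such filters without scanning every document the way this toy version does.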
  • 49. MongoDB
        <?php
        /***
         * SELECT * FROM posts WHERE id IN (
         *   SELECT post FROM comments GROUP
         *   BY post HAVING count(*) > 10)
         *
         * Note: $size matches an exact array length only, so it cannot be
         * combined with $gt; slide 50 keeps a counter field instead.
         */
        $filter = array(
            "comments" => array('$size' => array('$gt' => 10))
        );
        $result = $collection->find($filter);
        ?>
  • 50. MongoDB
        <?php
        /***
         * SELECT * FROM posts WHERE 10 < (
         *   SELECT count(*) FROM comments
         *   WHERE post = posts.id)
         */

        /* when inserting a comment */
        $collection->update(
            array("uri" => "uri"),                        // select
            array('$inc' => array('comments_size' => 1))  // increment
        );

        $filter = array(
            "comments_size" => array('$gt' => 10)
        );
        $result = $collection->find($filter);
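One way to keep the counter honest is to append the comment and bump the counter in the same update document. The sketch below assumes that approach; `add_comment()` and the `RecordingCollection` stub are hypothetical, and the stub only records what would be sent so the example runs without a MongoDB server. With the real extension, the collection object from the earlier slides would be passed instead.

```php
<?php
/**
 * Hypothetical stand-in for a collection: it records the arguments
 * of update() instead of talking to a server.
 */
class RecordingCollection
{
    public $calls = array();

    public function update($criteria, $newObject)
    {
        $this->calls[] = array($criteria, $newObject);

        return true;
    }
}

/** Append a comment and increment the counter in one update document. */
function add_comment($collection, $uri, array $comment)
{
    return $collection->update(
        array('uri' => $uri),                             // which post
        array(
            '$push' => array('comments' => $comment),     // embed the comment
            '$inc'  => array('comments_size' => 1),       // keep the counter in step
        )
    );
}

$collection = new RecordingCollection();
add_comment($collection, '/php-rules', array(
    'email'   => 'crodas@member.fsf.org',
    'name'    => 'crodas',
    'comment' => 'Yes it does',
));

print_r($collection->calls[0]);
```

Because both operators travel in a single update against a single document, the embedded array and the counter should not drift apart the way two separate writes could.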
  • 51. Map/Reduce (Extra time)
  • 52. Map/Reduce -- Theory
        <?php
        for ($i = 0; $i < 50; $i++) {
            $result[$i] = pow($i, 2);
        }
        var_dump($result);

        /***
         * If pow() takes 1 second:
         *   1 process    = 50 seconds
         *   10 processes = 5 seconds
         */
        ?>
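The arithmetic on this slide (50 seconds for one process, 5 seconds for ten) relies on the work being split into independent chunks. A minimal sketch of that split is below; it processes the chunks one after another in a single process, so it only shows the partitioning, and the `$workers` count is an assumption taken from the slide's example.

```php
<?php
$numbers = range(0, 49);
$workers = 10; // number of processes assumed on the slide

// Split the input into one chunk per worker, keeping the original keys.
$chunkSize = (int) ceil(count($numbers) / $workers);
$chunks    = array_chunk($numbers, $chunkSize, true);

// Each chunk is independent, so each could go to its own process or machine.
// Here they are handled sequentially to keep the example self-contained.
$result = array();
foreach ($chunks as $chunk) {
    foreach ($chunk as $i => $n) {
        $result[$i] = pow($n, 2);
    }
}

echo count($chunks), ' chunks, ', count($result), " results\n";
```

Since no chunk needs data from another, the total time is bounded by the slowest chunk rather than by the sum of all of them.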
  • 53. Map/Reduce -- Theory II
        <?php
        $data = range(1, 1000);
        $tmp  = array();

        /* MAP: group each value under the key ($value % 10) */
        foreach ($data as $key => $value) {
            $n_key = $value % 10;
            /* append */
            $tmp[$n_key][] = $value;
        }

        /* REDUCE: collapse each group into a single sum */
        foreach ($tmp as $key => $value) {
            $value = array_sum($value);
            print "{$key} = {$value}\n";
        }
  • 55. Thank you fellows!
  • 56. @crodas, crodas.org
  • 57. Powered by...