javier ramirez
@supercoco9
https://datawaki.com
Big Data analytics with
Nginx, Logstash, Redis,
Google BigQuery, and Neo4j
datawaki
moral of the story
you can do big,
if you know how
javier ramirez @supercoco9 https://datawaki.com
Apache Hadoop
Apache Cassandra
Apache Spark
Apache Storm
HBase
Kafka
big data is cool, but...
expensive cluster
hard to set up and monitor
not interactive enough
Data analysis as a service
Google BigQuery
The right-now data analytics platform
for your website, your backend, and your business
datawaki
The Challenge
Several thousand requests per second
From many devices/apps
Provide real-time alerts
Analyze billions of rows interactively
Extract graph information
The real challenge
Cheap
data from many sources
HTTP
Libraries available for virtually any
programming language
De facto standard for inter-system comms.
Easy to script from command line tools
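Because any HTTP client can act as a data source, reporting an event can be as small as one request. The following is a minimal sketch, not the talk's actual API: the collector URL and the field names are hypothetical, and it assumes events travel as query parameters so that they end up in the request line Nginx writes to its access log. It only builds the request and never sends it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical collector endpoint; not taken from the talk.
COLLECTOR_URL = "https://collector.example.com/track"

def build_event_request(event: dict) -> Request:
    """Encode an event as the query string of a GET request."""
    # Sort the fields so the resulting URL is deterministic.
    query = urlencode(sorted(event.items()))
    return Request(f"{COLLECTOR_URL}?{query}", method="GET")

request = build_event_request({"device": "android", "action": "signup"})
print(request.full_url)
```

The same request could be issued from a shell script or any language with an HTTP library, which is the point the slide makes.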
Free, open-source, high-performance HTTP
server and reverse proxy
Nginx is known for its high performance,
stability, rich feature set, simple configuration,
and low resource consumption.
Used by Netflix, Hulu, Pinterest, CloudFlare, Airbnb, WordPress.com, GitHub,
SoundCloud, Zynga, Eventbrite, Zappos, Media Temple, Heroku, RightScale, Engine
Yard and MaxCDN
[Diagram: data input reaching two Nginx servers, each writing a log]
Several hundred thousand requests per second per server
Limited by network bandwidth
Just add more servers ($5) and balance
Logstash is a tool for managing events
and logs. You can use it to collect logs,
parse them, and store them for later use
It is fully free and fully open source. The
license is Apache 2.0, meaning you are
pretty much free to use it however you
want in whatever way.
logstash: handle the log data
Highly scalable (JRuby process)
Input/Output/Codecs/Filters
Easily extensible using Ruby
[Diagram: Nginx servers (logs), Logstash]
Data verification: we discard invalid inputs in Logstash
We complete messages with basic info (timestamp, origin...)
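The verify-then-enrich step can be illustrated outside Logstash. This is only a sketch of the idea in Python, not a Logstash configuration: the required field names and the `origin` value are hypothetical, and Logstash would express the same logic with its own filters.

```python
from datetime import datetime, timezone
from typing import Optional

REQUIRED_FIELDS = ("device", "action")  # hypothetical required fields

def process(event: dict, origin: str, now: Optional[datetime] = None) -> Optional[dict]:
    """Return the enriched event, or None if the input is invalid."""
    if any(not event.get(field) for field in REQUIRED_FIELDS):
        return None  # discard invalid input
    now = now or datetime.now(timezone.utc)
    # Complete the message with basic info without mutating the input.
    return {**event, "timestamp": now.isoformat(), "origin": origin}

events = [{"device": "ios", "action": "login"}, {"device": "ios"}]
valid = [e for e in (process(ev, origin="10.0.0.1") for ev in events) if e is not None]
print(len(valid))
```

Only the first event survives here, because the second one lacks an `action` field.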
Redis
open source, BSD licensed, advanced
key-value store. It is often referred to as a
data structure server since keys can contain
strings, hashes, lists, sets and sorted sets.
http://redis.io
started in 2009 by Salvatore Sanfilippo @antirez
100+ contributors at
https://github.com/antirez/redis
javier ramirez @supercoco9 https://datawaki.com codemotion 2013
Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (with pipelining)
$ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
SET: 552,028 requests per second
GET: 707,463 requests per second
LPUSH: 767,459 requests per second
LPOP: 770,119 requests per second
Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (without pipelining)
$ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -q
SET: 122,556 requests per second
GET: 123,601 requests per second
LPUSH: 136,752 requests per second
LPOP: 132,424 requests per second
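The only difference between the two runs is `-P 16`, which makes redis-benchmark pipeline 16 requests per round trip. A short calculation over the figures quoted above shows what that buys on this machine, roughly 4.5x to 5.8x depending on the command. The numbers are copied from the slide, not re-measured.

```python
# Requests per second copied from the two benchmark runs above.
with_pipelining = {"SET": 552_028, "GET": 707_463, "LPUSH": 767_459, "LPOP": 770_119}
without_pipelining = {"SET": 122_556, "GET": 123_601, "LPUSH": 136_752, "LPOP": 132_424}

# Throughput ratio per command: pipelined run divided by plain run.
speedup = {cmd: with_pipelining[cmd] / without_pipelining[cmd] for cmd in with_pipelining}

for cmd, ratio in speedup.items():
    print(f"{cmd}: {ratio:.1f}x")
```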
Redis keeps everything in memory, all the time
what it's being used for
twitter (diagram of how a timeline is served)
• write API (from browser or client app)
• fanout (flockDB): rpushx to Redis, one per follower
• stored per entry: user id, tweet id, metadata
• user info from gizmoduck (memcached)
• tweet info from tweetypie (memcached + mysql)
• your twitter timeline
products using Redis
Pinterest
SnapChat
World of Warcraft
GitHub
HipChat
SoundCloud
Tumblr
Booking.com
YouPorn...
[Diagram: data input, Nginx servers (logs), Logstash, Redis, Ruby workers, alert system]
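The worker stage can be sketched as a loop that pops events from a queue and raises an alert when a rule matches. This is an illustration under stated assumptions rather than the datawaki implementation: the talk uses Ruby workers reading from Redis, whereas this sketch uses Python with a `collections.deque` standing in for a Redis list, and the alert rule (three errors from one origin) is hypothetical.

```python
from collections import Counter, deque

ERROR_THRESHOLD = 3  # hypothetical alert rule: 3 errors from one origin

def run_worker(queue: deque, alerts: list) -> None:
    """Drain the queue, alerting when an origin reaches the threshold."""
    errors = Counter()
    while queue:
        event = queue.popleft()  # stand-in for popping from a Redis list
        if event["action"] != "error":
            continue
        errors[event["origin"]] += 1
        # Alert exactly once, at the moment the threshold is reached.
        if errors[event["origin"]] == ERROR_THRESHOLD:
            alerts.append(f"{event['origin']}: {ERROR_THRESHOLD} errors")

queue = deque({"origin": "app-1", "action": "error"} for _ in range(3))
queue.append({"origin": "app-2", "action": "login"})
alerts = []
run_worker(queue, alerts)
print(alerts)
```

Because each worker only pops from a shared list, adding capacity amounts to starting more worker processes.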
Google BigQuery
Data analysis as a service
http://developers.google.com/bigquery
Based on Dremel
Specifically designed for
interactive queries over
petabytes of real-time data
What Dremel has been used for at Google
• Analysis of crawled web documents.
• Tracking install data for applications on Android Market.
• Crash reporting for Google products.
• OCR results from Google Books.
• Spam analysis.
• Debugging of map tiles on Google Maps.
• Tablet migrations in managed Bigtable instances.
• Results of tests run on Google’s distributed build system.
• Disk I/O statistics for hundreds of thousands of disks.
• Resource monitoring for jobs run in Google’s data centers.
• Symbols and dependencies in Google’s codebase.
INPUT/OUTPUT: Big Data's #1 enemy
INDEXES: data scientists' #1 enemy
Columnar
storage
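Why columnar storage helps with the I/O problem can be shown with a toy model. This is a sketch of the general idea only, with made-up sample data; it does not describe how Dremel or BigQuery lay out data internally. It counts how many stored values must be touched to aggregate a single column.

```python
rows = [
    {"title": "Madrid", "views": 120, "language": "es"},
    {"title": "London", "views": 300, "language": "en"},
    {"title": "Paris", "views": 210, "language": "fr"},
]

# Columnar layout: one list per column instead of one record per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}

def values_read_row_store(rows: list) -> int:
    """A row store reads every field of every row, whatever column is wanted."""
    return sum(len(row) for row in rows)

def values_read_column_store(columns: dict, column: str) -> int:
    """A column store reads only the requested column."""
    return len(columns[column])

total_views = sum(columns["views"])
print(total_views, values_read_row_store(rows), values_read_column_store(columns, "views"))
```

With three columns the column store reads a third of the values; the saving grows with the number of columns a query does not need.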
Colossus filesystem
Distributed/redundant
Parallel reads
Ultra fast network
highly distributed
execution using a tree
loading data
You can feed flat CSV-like
files or nested JSON objects
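For the JSON option, BigQuery expects newline-delimited JSON, that is, one object per line. Below is a minimal sketch of producing such a payload; the event shape (a nested `user` record and a repeated `actions` field) is hypothetical and only meant to show nesting.

```python
import json

# Hypothetical events with a nested record and a repeated field.
events = [
    {"user": {"id": 1, "country": "ES"}, "actions": ["login", "search"]},
    {"user": {"id": 2, "country": "UK"}, "actions": ["signup"]},
]

def to_ndjson(records: list) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)

payload = to_ndjson(events)
print(payload)
```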
web console screenshot
Things you always wanted to try but were too scared to:
analytical SQL functions.
correlations.
window functions.
views.
JSON fields.
timestamped tables.
SELECT COUNT(*)
FROM publicdata:samples.wikipedia
WHERE REGEXP_MATCH(title, "[0-9]*")
  AND wp_namespace = 0;

Result: 223,163,387
Query complete (5.6s elapsed, 9.13 GB processed)
Global Database of Events,
Language and Tone
quarter billion rows
30 years
updated daily
http://gdeltproject.org/data.html#googlebigquery
SELECT Year, Actor1Name, Actor2Name, Count FROM (
  SELECT Actor1Name, Actor2Name, Year,
    COUNT(*) Count,
    RANK() OVER(PARTITION BY YEAR ORDER BY Count DESC) rank
  FROM
    (SELECT Actor1Name, Actor2Name, Year
     FROM [gdelt-bq:full.events]
     WHERE Actor1Name < Actor2Name
       AND Actor1CountryCode != '' AND Actor2CountryCode != ''
       AND Actor1CountryCode != Actor2CountryCode),
    (SELECT Actor2Name Actor1Name, Actor1Name Actor2Name, Year
     FROM [gdelt-bq:full.events]
     WHERE Actor1Name > Actor2Name
       AND Actor1CountryCode != '' AND Actor2CountryCode != ''
       AND Actor1CountryCode != Actor2CountryCode),
  WHERE Actor1Name IS NOT NULL
    AND Actor2Name IS NOT NULL
  GROUP EACH BY 1, 2, 3
  HAVING Count > 100
)
WHERE rank = 1
ORDER BY Year
BigQuery pricing
$20 per stored TB
1000000 rows => $0.004 / month
$5 per processed TB
1 full scan (1MM rows) ~ 200 MB
1 count = 0 MB
1 full scan over 1 column ~ 15 MB
* the 1st TB every month is free of charge
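The figures on this slide follow from simple proportions. The sketch below reproduces them using the slide's prices and sizes; it assumes decimal units (1 TB = 1,000,000 MB) and ignores the free first TB, so treat it as a back-of-the-envelope check rather than a billing calculator.

```python
STORAGE_USD_PER_TB_MONTH = 20.0  # from the slide
QUERY_USD_PER_TB = 5.0           # from the slide
MB_PER_TB = 1_000_000            # decimal units, an assumption

def storage_cost(mb: float) -> float:
    """Monthly cost in USD of storing `mb` megabytes."""
    return mb / MB_PER_TB * STORAGE_USD_PER_TB_MONTH

def query_cost(mb: float) -> float:
    """Cost in USD of a query that processes `mb` megabytes."""
    return mb / MB_PER_TB * QUERY_USD_PER_TB

print(f"store 1MM rows (200 MB): ${storage_cost(200):.4f} per month")
print(f"full scan (200 MB): ${query_cost(200):.6f}")
print(f"one-column scan (15 MB): ${query_cost(15):.6f}")
```

Storing the 200 MB that a million rows take comes to $0.004 per month, matching the slide, and a full scan of them costs about a tenth of a cent.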
[Diagram: data input, Nginx servers (logs), Logstash, Redis, BigQuery, Ruby workers, alert system]
Neo4j is a high performance graph store with all the
features expected of a mature and robust database, like a
friendly query language and ACID transactions.
The programmer works with a flexible network structure of
nodes and relationships rather than static tables, yet
enjoys all the benefits of an enterprise-quality database.
For many applications, Neo4j offers orders of magnitude
performance benefits compared to relational DBs.
How we are using Neo4j
Define data flows (funnels) for users or devices
Check if the data points are part of a funnel
Store BigQuery ID on the graph so we can
cross analytical queries with data flows
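Checking whether data points are part of a funnel boils down to walking an ordered path of steps. In Neo4j that path would be nodes joined by relationships; the sketch below swaps the graph for a plain Python list to show only the matching logic. The funnel steps and the events are hypothetical.

```python
def completed_steps(funnel: list, events: list) -> int:
    """Count how many funnel steps the events reach, in order."""
    position = 0
    for event in events:
        # Advance only when the event is the next expected step.
        if position < len(funnel) and event == funnel[position]:
            position += 1
    return position

funnel = ["visit", "signup", "purchase"]  # hypothetical funnel
events = ["visit", "search", "signup", "visit"]
print(completed_steps(funnel, events))
```

Here the user reaches two of the three steps: unrelated events such as `search` are skipped, and the funnel is complete only when the result equals `len(funnel)`.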
Cypher Query Language
MATCH
  startPath=(root)-[:`2010`]->()-[:`12`]->()-[:`31`]->(startLeaf),
  endPath=(root)-[:`2011`]->()-[:`01`]->()-[:`03`]->(endLeaf),
  valuePath=(startLeaf)-[:NEXT*0..]->(middle)-[:NEXT*0..]->(endLeaf),
  vals=(middle)-[:VALUE]->(event)
WHERE root.name = 'Root'
RETURN event.name
ORDER BY event.name ASC
[Screenshot: Neo4j web console]
[Diagram: data input, Nginx servers (logs), Logstash, Redis, BigQuery, Neo4j, Ruby workers, alert system, PostgreSQL]
datawaki in a nutshell
[Diagram: data input, Nginx servers (logs), Logstash, Redis, BigQuery, Neo4j, Ruby workers, Rails app, alert system, report system, user interaction]
Cost of a minimum system
Nginx $5 per server
Logstash $10 per server
Redis $5
Ruby workers $5 per server
BigQuery $5 per 500MM rows
Neo4j $10 per server
Rails $5 per server
total: $45 / month + backups
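The total can be checked by adding up the line items. This sketch assumes one server per component and BigQuery usage within the 500MM rows quoted above; backups are left out, as on the slide.

```python
# Monthly cost in USD per component, copied from the slide.
monthly_cost_usd = {
    "Nginx": 5,
    "Logstash": 10,
    "Redis": 5,
    "Ruby workers": 5,
    "BigQuery": 5,
    "Neo4j": 10,
    "Rails": 5,
}

total = sum(monthly_cost_usd.values())
print(total)
```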
javier ramirez
@supercoco9
https://datawaki.com
Thanks!
datawaki

Más contenido relacionado

La actualidad más candente

Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use CasesMax De Marzi
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)Paul Chao
 
2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQL2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQLYu Ishikawa
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...NoSQLmatters
 
Big Data Expo 2015 - Gigaspaces Making Sense of it all
Big Data Expo 2015 - Gigaspaces Making Sense of it allBig Data Expo 2015 - Gigaspaces Making Sense of it all
Big Data Expo 2015 - Gigaspaces Making Sense of it allBigDataExpo
 
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...guest5b1607
 
GDD Brazil 2010 - Google Storage, Bigquery and Prediction APIs
GDD Brazil 2010 - Google Storage, Bigquery and Prediction APIsGDD Brazil 2010 - Google Storage, Bigquery and Prediction APIs
GDD Brazil 2010 - Google Storage, Bigquery and Prediction APIsPatrick Chanezon
 
From Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of StreamsFrom Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of StreamsMike Fowler
 
Hadoop Hand-on Lab: Installing Hadoop 2
Hadoop Hand-on Lab: Installing Hadoop 2Hadoop Hand-on Lab: Installing Hadoop 2
Hadoop Hand-on Lab: Installing Hadoop 2IMC Institute
 
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQueryGDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQueryMárton Kodok
 
Find out how DataScience has revolutionized SEO for OVH
Find out how DataScience has revolutionized SEO for OVHFind out how DataScience has revolutionized SEO for OVH
Find out how DataScience has revolutionized SEO for OVHVincent Terrasi
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryMárton Kodok
 
Google Dremel. Concept and Implementations.
Google Dremel. Concept and Implementations.Google Dremel. Concept and Implementations.
Google Dremel. Concept and Implementations.Vicente Orjales
 
Big Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop WorkshopBig Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop WorkshopIMC Institute
 
How Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses CassandraHow Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses Cassandragdusbabek
 
Druid Adoption Tips and Tricks
Druid Adoption Tips and TricksDruid Adoption Tips and Tricks
Druid Adoption Tips and TricksImply
 

La actualidad más candente (20)

Big Data made easy with a Spark
Big Data made easy with a SparkBig Data made easy with a Spark
Big Data made easy with a Spark
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
 
2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQL2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQL
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
 
Big Data Expo 2015 - Gigaspaces Making Sense of it all
Big Data Expo 2015 - Gigaspaces Making Sense of it allBig Data Expo 2015 - Gigaspaces Making Sense of it all
Big Data Expo 2015 - Gigaspaces Making Sense of it all
 
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
Text Analytics Summit 2009 - Roddy Lindsay - "Social Media, Happiness, Petaby...
 
GDD Brazil 2010 - Google Storage, Bigquery and Prediction APIs
GDD Brazil 2010 - Google Storage, Bigquery and Prediction APIsGDD Brazil 2010 - Google Storage, Bigquery and Prediction APIs
GDD Brazil 2010 - Google Storage, Bigquery and Prediction APIs
 
From Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of StreamsFrom Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of Streams
 
Logs & Visualizations at Twitter
Logs & Visualizations at TwitterLogs & Visualizations at Twitter
Logs & Visualizations at Twitter
 
Hadoop Hand-on Lab: Installing Hadoop 2
Hadoop Hand-on Lab: Installing Hadoop 2Hadoop Hand-on Lab: Installing Hadoop 2
Hadoop Hand-on Lab: Installing Hadoop 2
 
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQueryGDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
 
Find out how DataScience has revolutionized SEO for OVH
Find out how DataScience has revolutionized SEO for OVHFind out how DataScience has revolutionized SEO for OVH
Find out how DataScience has revolutionized SEO for OVH
 
Elastic Stack Roadmap
Elastic Stack RoadmapElastic Stack Roadmap
Elastic Stack Roadmap
 
GitConnect
GitConnectGitConnect
GitConnect
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
 
Google Dremel. Concept and Implementations.
Google Dremel. Concept and Implementations.Google Dremel. Concept and Implementations.
Google Dremel. Concept and Implementations.
 
Big Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop WorkshopBig Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop Workshop
 
How Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses CassandraHow Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses Cassandra
 
Druid Adoption Tips and Tricks
Druid Adoption Tips and TricksDruid Adoption Tips and Tricks
Druid Adoption Tips and Tricks
 

Similar a Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, javier ramirez, datawaki

Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...javier ramirez
 
API analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters editionAPI analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters editionjavier ramirez
 
OrientDB the database for the web 1.1
OrientDB the database for the web 1.1OrientDB the database for the web 1.1
OrientDB the database for the web 1.1Luca Garulli
 
OrientDB for real & Web App development
OrientDB for real & Web App developmentOrientDB for real & Web App development
OrientDB for real & Web App developmentLuca Garulli
 
api analytics redis bigquery. Lrug
api analytics redis bigquery. Lrugapi analytics redis bigquery. Lrug
api analytics redis bigquery. Lrugjavier ramirez
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09Chris Purrington
 
Intro to big data analytics using microsoft machine learning server with spark
Intro to big data analytics using microsoft machine learning server with sparkIntro to big data analytics using microsoft machine learning server with spark
Intro to big data analytics using microsoft machine learning server with sparkAlex Zeltov
 
Handling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperHandling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperryanlecompte
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist SoftServe
 
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014javier ramirez
 
Big Data on azure
Big Data on azureBig Data on azure
Big Data on azureDavid Giard
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionCodemotion
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDBDenny Lee
 
StackOverflow Architectural Overview
StackOverflow Architectural OverviewStackOverflow Architectural Overview
StackOverflow Architectural OverviewFolio3 Software
 
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...Codemotion
 
Works with persistent graphs using OrientDB
Works with persistent graphs using OrientDB Works with persistent graphs using OrientDB
Works with persistent graphs using OrientDB graphdevroom
 
Big Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David GiardBig Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David GiardITCamp
 
What's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsWhat's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsRedis Labs
 

Similar a Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, javier ramirez, datawaki (20)

Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
 
API analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters editionAPI analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters edition
 
OrientDB the database for the web 1.1
OrientDB the database for the web 1.1OrientDB the database for the web 1.1
OrientDB the database for the web 1.1
 
OrientDB for real & Web App development
OrientDB for real & Web App developmentOrientDB for real & Web App development
OrientDB for real & Web App development
 
api analytics redis bigquery. Lrug
api analytics redis bigquery. Lrugapi analytics redis bigquery. Lrug
api analytics redis bigquery. Lrug
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09
 
Intro to big data analytics using microsoft machine learning server with spark
Intro to big data analytics using microsoft machine learning server with sparkIntro to big data analytics using microsoft machine learning server with spark
Intro to big data analytics using microsoft machine learning server with spark
 
Ml2
Ml2Ml2
Ml2
 
Handling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperHandling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeper
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
 
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
 
Big Data on azure
Big Data on azureBig Data on azure
Big Data on azure
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDB
 
StackOverflow Architectural Overview
StackOverflow Architectural OverviewStackOverflow Architectural Overview
StackOverflow Architectural Overview
 
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
Perchè potresti aver bisogno di un database NoSQL anche se non sei Google o F...
 
Works with persistent graphs using OrientDB
Works with persistent graphs using OrientDB Works with persistent graphs using OrientDB
Works with persistent graphs using OrientDB
 
Big Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David GiardBig Data Solutions in Azure - David Giard
Big Data Solutions in Azure - David Giard
 
What's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsWhat's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis Labs
 
Ibm db2 big sql
Ibm db2 big sqlIbm db2 big sql
Ibm db2 big sql
 

Más de javier ramirez

¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfestjavier ramirez
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databasejavier ramirez
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...javier ramirez
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...javier ramirez
 
Deduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBDeduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBjavier ramirez
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)javier ramirez
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Databasejavier ramirez
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...javier ramirez
 
QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728javier ramirez
 
Processing and analysing streaming data with Python. Pycon Italy 2022
Processing and analysing streaming  data with Python. Pycon Italy 2022Processing and analysing streaming  data with Python. Pycon Italy 2022
Processing and analysing streaming data with Python. Pycon Italy 2022javier ramirez
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...javier ramirez
 
Servicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en AragónServicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en Aragónjavier ramirez
 
Primeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessPrimeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessjavier ramirez
 
How AWS is reinventing the cloud
How AWS is reinventing the cloudHow AWS is reinventing the cloud
How AWS is reinventing the cloudjavier ramirez
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMjavier ramirez
 
Getting started with streaming analytics
Getting started with streaming analyticsGetting started with streaming analytics
Getting started with streaming analyticsjavier ramirez
 
Getting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelineGetting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelinejavier ramirez
 
Getting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep DiveGetting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep Divejavier ramirez
 
Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)javier ramirez
 
Monitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSMonitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSjavier ramirez
 

Más de javier ramirez (20)

¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series database
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...


Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, javier ramirez, datawaki

Editor's notes

  1. Nobody doubts that your API is technically very good, but...
  2. Obvious conclusion: this is going to be a big data problem. The trouble was that we knew nothing about big data. We had heard of map/reduce, Hadoop, Cassandra... but we were missing information.
  3. Master-slave, transactions, atomicity/concurrency. We keep a list into which we insert one entry for every API operation. NEXT: it is fast because it runs in memory.
  4. We did not have to think much here, because we were already using Redis for several things in the system, precisely because it allows many operations at very low cost.
  5. But we can configure persistence and redundancy. Intermediate storage, cache, index.
  6. NEXT: Twitter
  7. Katy Perry, 45MM. 400 million tweets per day; 4,600 tweets per second; 30 billion timeline deliveries per day; 300K queries per second; 7,000 tweets per second at peak times; 12,000 tweets per second at events; 143,199 tweets per second during Castle in the Sky; 800 tweets of history per timeline in Redis.
  8. 44% of Twitter accounts have never posted a tweet. Each tweet is replicated 3 times. 2 terabytes of RAM for the Redis cluster. Katy Perry: 67 million inserts (Justin Bieber: 56MM). T-bird holds the tweets in MySQL; Gizmoduck holds all users in memcached; Tweetypie holds all tweets for the last 45 days.
  9. Snapchat: 400MM daily. YouPorn: 200MM daily.
  10. Because the Redis protocol is very simple, Redis can be accessed from any programming language, and since web servers support scripting, you can set the info directly from the web server. So if you have several backends (Rails, Node...), you can centralize all your logging into a single layer.
  11. Not for analytics! Everything is in memory! So far we have solved the velocity part of big data, and a bit of the veracity, but we need more.
  12. Apache Drill is the open-source equivalent. It does not run as a service. BigQuery is a REST wrapper on top of Dremel. It is usable from any platform that supports REST, and APIs are available for different languages. Inserts only! No deletes or updates. Often used together with map/reduce or Hadoop. In-place analysis, with no prior loading, no indexes, and no need to plan queries in advance.
  13. full scan!
  14. A typical solid-state disk reads at 550 MB/second.
  15. The public enemy of data scientists and interactive queries.
  16. Column data is of uniform type; therefore, there are some opportunities for storage size optimizations available in column-oriented data that are not available in row-oriented data. Also less I/O. In addition, Dremel provides a tree structure for dispatching queries.
  17. Batch and real time, both for data input (files or stream) and for output (interactive or batch). You pay for what you use. Read only!!!
  18. Batch and real time, both for data input (files or stream) and for output (interactive or batch). You pay for what you use.
  19. Web console, REST API, command line. Notice the validate button to avoid expenses.
  20. next: full scan regexp
  21. total 313,797,035
  22. Global database of events, language, and tone. A quarter billion rows, 30 years, updated daily.
  23. review!
  24. Open source
  25. Open source
  26. Open source
  27. You can combine services on a single server at first
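
Note 10 claims that the Redis protocol is simple enough to speak from any language. As a rough illustration, here is a minimal sketch that frames a command in RESP (the Redis serialization protocol) as an array of bulk strings. The function name `encode_command`, the key `api:log`, and the sample log entry are hypothetical and do not come from the slides; the `LPUSH` call mirrors the "one list entry per API operation" idea from note 3.

```python
def encode_command(*args):
    """Encode a Redis command as a RESP array of bulk strings."""
    # Array header: '*' followed by the number of elements.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Each element is a bulk string: '$' + byte length, then the bytes.
        parts.append(b"$%d\r\n" % len(data))
        parts.append(data + b"\r\n")
    return b"".join(parts)


# One log entry per API operation, pushed onto a list (hypothetical key name).
payload = encode_command("LPUSH", "api:log", "GET /v1/items 200")
```

Writing `payload` to a TCP socket connected to a Redis server (port 6379 by default) would be enough to run the command, which is why a web server's scripting layer could plausibly log straight into Redis without a full client library.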