The Other Way of Doing Big Data: Declarative, Decoupled, Federated, Simple, and Resilient.
Also known as: How to Win at Scale and its Influence of People. Originally presented by Flip Kromer to the Research Board, http://www.researchboard.com/, June 2012.
11. • Manage 100s of machines: architecture as code
• Contain system complexity: relentlessly decouple
• Maintain coherency: federated truth
• Manage true costs: optimize for people not machines
• Manage failure & change: resiliency engineering
12. The Other Way
Declarative, not Homogeneous
Decoupled, not Standardized
Federated, not Centralized
Simple, not Performant
Resilient, not Reliable
38. Data Stores in Production
• HBase • MySQL
• ElasticSearch • Redis
• Cassandra • sqlite
• TokyoTyrant • whisper (graphite)
• SimpleDB • file system
• MongoDB • S3
39. Programs Used for This Talk
• Emacs • Skitch
• Keynote • finder
• Preview • flickr.com
• Chrome • google image search
• ruby (pry) • ssh
40. How’s my Batch Job Going?
• 1 x Job Status
• 1 x Counters & App Metrics
• N x Task Status
• M x Machine System Stats
• 1 x Cloud Status
• 1 x Chef Server
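The list above implies that the number of places to check grows with the size of the job. A minimal sketch of that count, assuming one source per bullet; the function name and the example values for N and M are hypothetical, not from the talk:

```python
def status_sources(n_tasks: int, m_machines: int) -> int:
    """Count the places to look when asking "how's my batch job going?".

    Fixed sources (one each): job status, counters & app metrics,
    cloud status, Chef server. Variable sources: one per task,
    one per machine.
    """
    fixed = 4  # job status + counters/metrics + cloud status + Chef server
    return fixed + n_tasks + m_machines


# Hypothetical job: 200 tasks spread over 30 machines.
total = status_sources(n_tasks=200, m_machines=30)
print(total)
```

The point of the sketch is only that the fixed sources are dwarfed by the N and M terms as the cluster grows.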
52. n^2 law of coupling
100 things vs. 5 + 3 + 2 things + 2 (tax)
53. n^2 law of coupling
10,000 things to go wrong
vs.
2500 + 900 + 400 + 400 = 4200 things to go wrong
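The arithmetic on this slide can be reproduced by squaring the size of each group and summing. A minimal sketch, assuming the single system has 100 things and the decoupled one is split into groups of 50, 30, and 20 plus a 20-thing "tax" (sizes inferred from the squares 2500, 900, 400, and 400; the function name is hypothetical):

```python
def things_to_go_wrong(group_sizes):
    """n^2 law of coupling: each group of n coupled things
    contributes n * n potential interactions."""
    return sum(n * n for n in group_sizes)


coupled = things_to_go_wrong([100])               # one big system
decoupled = things_to_go_wrong([50, 30, 20, 20])  # three parts + tax

print(coupled)    # 10000
print(decoupled)  # 4200
```

Under these assumptions, decoupling adds 20 things of overhead yet cuts the count of things that can go wrong by more than half.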
55. Infochimps.com 2011
[Architecture diagram; labels: infochimps.com, text search, API acct'g, models, A/B testing, Planet of the APIs, cloud services]
56. Infochimps.com 2012
[Architecture diagram; labels: datasets, catalog API, API docs, text search, content, dashboards, API acct'g, auth & payment, layout, console, models, A/B testing, blog, press, collateral, Planet of the APIs, cloud services]
57. Infochimps.com 2012
[Architecture diagram; labels: (infochimps), (saas), icsexpl, catalog API, capuchin, elasticsrch, kanzi, beergoggls, MongoDB, george, george, alphamale, MySQL, redis, WPEngine, totem, hubspot, Planet of the APIs, cloud services]
58. this drawing fits in my head
datasets catalog API
this app fits in my head, and my laptop
59. Infochimps.com 2012
[Same architecture diagram as slide 57]
This is at a 15-person organization.
Federated, meaning the data is semantically disparate.
People are walking around as if we used to have one kind of database and now we have two.
The important fact isn’t that one of them is sharded.
The important fact is that they’re proliferating, and that’s a good thing.
Google, Facebook, and Amazon had to solve the scalability problem.
Now I know this sounds like the lunacy of a ritalin-addled architecture astronaut spending too much time on StackOverflow.