5. Who Am I?
Member of Netflix's Platform Engineering team, working on very large scale data infrastructure (@g9yuayon)
Built and operated Netflix's cloud crypto service
Worked with Jae Bae on querying multi-dimensional data in real time
Friday, March 1, 13
7. No Monitoring Metrics Today
Developers usually think of monitoring metrics when "real-time" data is mentioned. We have powerful monitoring systems that track millions of metrics per second. But I'm not going to talk about them today. Monitoring metrics are crucial data; that alone would warrant another multi-hour talk by our monitoring team. :-)
12. Highly Reliable Data Pipeline
[Diagram: Server Farms → Log Collectors (Kafka) → Log Filter/Sink Plugins → Hadoop, Druid, ElasticSearch]
photo credit: http://www.flickr.com/photos/decade_null/142235888/sizes/m/in/photostream/
We have tens of thousands of machines, all of which send log data over a robust pipeline to highly reliable log collectors. The collectors then filter the data, transform it, and dispatch it to different destinations for further processing.
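The filter/transform/dispatch pattern the collectors follow can be sketched in a few lines. Everything below (the `LogCollector` and `Sink` names, the event shape) is a hypothetical illustration of the pattern, not Netflix's actual implementation:

```python
# A minimal sketch of the collector pattern: each log event passes through
# filters, then transforms, and is fanned out to one or more sinks
# (standing in for Hadoop, Druid, ElasticSearch). All names are hypothetical.

class Sink:
    """Collects events destined for one downstream system."""
    def __init__(self, name):
        self.name = name
        self.events = []

    def write(self, event):
        self.events.append(event)

class LogCollector:
    def __init__(self, filters, transforms, sinks):
        self.filters = filters        # predicates: drop events that fail any
        self.transforms = transforms  # functions: reshape each event
        self.sinks = sinks            # destinations for further processing

    def process(self, event):
        if not all(f(event) for f in self.filters):
            return  # filtered out
        for t in self.transforms:
            event = t(event)
        for sink in self.sinks:
            sink.write(event)

# Usage: keep only error-level events, tag them, fan out to two sinks.
hadoop, druid = Sink("hadoop"), Sink("druid")
collector = LogCollector(
    filters=[lambda e: e.get("level") == "ERROR"],
    transforms=[lambda e: {**e, "pipeline": "log"}],
    sinks=[hadoop, druid],
)
collector.process({"level": "ERROR", "msg": "timeout"})
collector.process({"level": "INFO", "msg": "ok"})
print(len(hadoop.events), len(druid.events))  # each sink received only the error
```

The key design point is that sinks are pluggable: adding a new destination means adding a sink, not touching the farms that emit logs.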
13. A Humble Beginning
We didn't build everything in one night. We had a humble start: I did a lot of log scraping like this, and I also used R to analyze logs. But these were specific, one-off tasks, and at some point
19. [Diagram: the number of applications multiplies]
Something happened. Our traffic turned into a hockey stick, and the number of applications exploded. So log traffic also exploded. Simple log scraping wouldn't cut it any more.
22. So We Evolved
hgrep -C 10 -k 5,2,3 'users.*[1-9]{3}' *catalina.out s3://bucket
So we evolved. One thing we built was a Hadoop grep. This tool searches TBs of data. It is much more useful than the one provided by the Apache Hadoop distribution because it supports many more grep options, such as context and sorting by columns. And DSE's Hadoop-as-a-service greatly helps each team.
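The flags in the hgrep command mirror classic grep/sort semantics: `-C 10` keeps 10 lines of context around each match, and `-k 5,2,3` sorts matching lines by columns 5, then 2, then 3. A single-machine sketch of that behavior (the real tool runs as a Hadoop job over TBs of logs; these helper names are illustrative):

```python
import re

def grep_with_context(lines, pattern, context=10):
    """Return matching lines plus `context` lines on each side (grep -C)."""
    rx = re.compile(pattern)
    hits = [i for i, line in enumerate(lines) if rx.search(line)]
    keep = set()
    for i in hits:
        keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return [lines[i] for i in sorted(keep)]

def sort_by_columns(lines, cols):
    """Sort whitespace-delimited lines by 1-based column numbers (sort -k)."""
    def key(line):
        fields = line.split()
        return tuple(fields[c - 1] for c in cols)
    return sorted(lines, key=key)

logs = [
    "a 1 x users 42",
    "b 2 y other 7",
    "a 3 z users 999",
]
# context=0 for this tiny example; hgrep above used -C 10
matches = grep_with_context(logs, r"users.*[1-9]{3}", context=0)
print(sort_by_columns(matches, [5, 2, 3]))
```

Distributing this is what Hadoop buys you: the grep runs as mappers over log splits, and the column sort becomes the shuffle's sort key.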
29. We also developed a search tool that searches the logs of live instances.
30.
Field Name      Field Value
Client          "API"
Server          "Cryptex"
StatusCode      200
ResponseTime    73
Hive becomes indispensable.
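Once log events carry structured fields like the ones in the table, ad-hoc questions become SQL. A Hive query over such data might look like the SELECT below; it is sketched here against sqlite3 so it runs standalone, and the schema and values are illustrative, not Netflix's actual tables:

```python
import sqlite3

# Structured log events, one row per request (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE request_log
                (Client TEXT, Server TEXT, StatusCode INT, ResponseTime INT)""")
conn.executemany(
    "INSERT INTO request_log VALUES (?, ?, ?, ?)",
    [("API", "Cryptex", 200, 73),
     ("API", "Cryptex", 500, 1200),
     ("WWW", "Cryptex", 200, 45)],
)

# Error count and average latency per client: the kind of summary that is
# one query in Hive but painful with raw log scraping.
rows = conn.execute("""
    SELECT Client,
           SUM(CASE WHEN StatusCode >= 500 THEN 1 ELSE 0 END) AS errors,
           AVG(ResponseTime) AS avg_ms
    FROM request_log
    GROUP BY Client
    ORDER BY Client
""").fetchall()
print(rows)
```

The same statement (modulo dialect details) runs unchanged in Hive over the full log warehouse.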
35. Still, We Have a Real-Time Itch
So we built yet another tool to scratch it, with the help of Druid.
38. Error summary in the past 10 seconds: you get to slice and dice through arbitrary combinations of dimensions across multiple time series.
Trends over the search query "90210" by Canadians.
How many people started streaming any episode of House of Cards in the past hour, grouped
41. A query for all the users who started streaming House of Cards in the past three hours, and the results came back in seconds.
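A question like that maps naturally onto Druid's native JSON query API. A rough sketch of a groupBy query for users who started streaming a title in a three-hour window follows; the dataSource, dimension names, and filter values are hypothetical stand-ins, not Netflix's actual schema:

```python
import json

# Hypothetical Druid groupBy query: one row per user who started streaming
# "House of Cards" in the given three-hour interval.
query = {
    "queryType": "groupBy",
    "dataSource": "playback_events",   # hypothetical datasource name
    "granularity": "all",
    "intervals": ["2013-03-01T12:00/2013-03-01T15:00"],
    "dimensions": ["user_id"],         # hypothetical dimension
    "filter": {"type": "selector",
               "dimension": "title",
               "value": "House of Cards"},
    "aggregations": [{"type": "count", "name": "streams_started"}],
}
print(json.dumps(query, indent=2))
```

Because Druid pre-aggregates by dimension at ingestion time, queries like this scan compact indexed segments rather than raw logs, which is what makes second-scale answers over recent data possible.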
43. See You Tomorrow
If you're interested in how we did the real-time interactive queries with the help of Druid, do come to our talk. See you tomorrow!