Gian Merlino, Matt Herman, Vivek Pasari, Samarth Jain
Druid Meetup
11/14/18 - Los Gatos, CA
@ Netflix
Agenda
● Druid Deployment @ Netflix
● Scaling & Sketch Strings
● Druid Roadmap
● Q&A
Druid Deployment & Use Cases @ Netflix
Overview
● Druid in the Netflix data warehouse
● Data ingestion
● Deploying Druid @ Netflix
● Use Cases
Netflix Data Warehouse Pipeline
Druid Ingestion
● Batch ingestion
● Druid hadoop indexer
● Input - Hive Text/Parquet tables
● S3 deep storage
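As a rough illustration of the batch path listed above, the sketch below builds a Hadoop index task as a plain Python dict. The field layout follows Druid's documented `index_hadoop` task format. The datasource name, S3 path, column names, and the choice of a `\u0001`-delimited text format (a common Hive text default) are assumptions made for the example, not details from the talk.

```python
import json


def hadoop_index_task(datasource, paths, interval, timestamp_column, columns, dimensions):
    """Build a Druid `index_hadoop` task for delimited text input (illustrative only)."""
    return {
        "type": "index_hadoop",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "parser": {
                    "type": "hadoopyString",
                    "parseSpec": {
                        "format": "tsv",
                        "delimiter": "\u0001",  # assumed Hive text field delimiter
                        "columns": columns,
                        "timestampSpec": {"column": timestamp_column, "format": "auto"},
                        "dimensionsSpec": {"dimensions": dimensions},
                    },
                },
                "metricsSpec": [{"type": "count", "name": "count"}],
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",
                    "queryGranularity": "HOUR",
                    "intervals": [interval],
                },
            },
            "ioConfig": {
                "type": "hadoop",
                "inputSpec": {"type": "static", "paths": paths},
            },
            "tuningConfig": {"type": "hadoop"},
        },
    }


# Hypothetical table layout and location, for illustration only.
task = hadoop_index_task(
    datasource="table_name",
    paths="s3://example-bucket/warehouse/table_name/dateint=20181114/",
    interval="2018-11-14/2018-11-15",
    timestamp_column="event_ts",
    columns=["event_ts", "country", "device"],
    dimensions=["country", "device"],
)
print(json.dumps(task, indent=2))
```

The resulting JSON is what would be submitted to the indexing service; deep storage (S3 here) is configured on the cluster rather than in the task.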
Druid Ingestion
import BigDataApi as bda
tbl = bda.Table("hive/table_name")
# build the ingestion spec from the Hive table
spec = (
    bda.druid.DruidSpec.from_table(tbl)
    .spec("ingestion_spec.json")
)
# create an indexer job from the spec and pick the target cluster
# (kg is used as on the slide; its import is not shown there)
job = (
    bda.genie.DruidIndexerJob(spec)
    .cluster(kg.druid.clusters.DRUID_CLUSTER_NAME)
)
# submit the job
job.execute()
Druid Cluster @ Netflix
● r4.16xlarge instance type
● Druid version 0.12.2
● ~100s of nodes
Multitenancy
● Single Tier
● Router
○ Ad hoc
○ Experimental - broker downtime acceptable. Used for query fine-tuning etc.
○ Reporting - predefined queries and dashboards
Autoscale
● Favor segments in memory
● Autoscale up - cluster disk utilization beyond 80%
● Handle large data ingestion without having to worry about the cluster falling over
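The 80% disk-utilization trigger above can be sketched as a small sizing function. This is only an illustration of a threshold-based policy: the function name, the assumption of equally sized nodes, and the rule of adding just enough nodes to get back to the threshold are all hypothetical, not the policy Netflix actually runs.

```python
import math


def nodes_to_add(used_bytes, node_capacity_bytes, node_count, threshold_pct=80):
    """Return how many equally sized nodes to add so that cluster disk
    utilization is at or below threshold_pct. Returns 0 if no scale-up is needed."""
    total_bytes = node_capacity_bytes * node_count
    # Scale up only when utilization is strictly beyond the threshold.
    if used_bytes * 100 <= threshold_pct * total_bytes:
        return 0
    # Smallest node count whose capacity keeps utilization within the threshold.
    required = math.ceil(used_bytes * 100 / (threshold_pct * node_capacity_bytes))
    return required - node_count
```

For example, a 10-node cluster at 85% utilization would get one extra node under this rule, while a cluster at 70% would be left alone.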
Deployment Pipeline
● Spinnaker (https://www.spinnaker.io/)
● Clusters upgraded using red/black deployment
○ Jenkins jobs - Druid tarball and Debian package
○ Deploy components with the new code line
○ Wait for segments to load
○ Switch DNS records
○ Scale down the old cluster
● Rollback
○ Switch DNS back to the old cluster
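The red/black steps above can be sketched as a short orchestration function. Every callable passed in is a hypothetical stand-in for whatever Spinnaker or Jenkins does in practice. `load_status` is assumed to return a mapping of datasource to percent of segments loaded, which is the shape reported by the Druid coordinator's `/druid/coordinator/v1/loadstatus` endpoint.

```python
import time


def red_black_upgrade(deploy_new, load_status, switch_dns, scale_down_old,
                      poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Run the red/black steps in order. Returns True if traffic was switched."""
    deploy_new()
    waited = 0.0
    while True:
        status = load_status()  # e.g. {"datasource_name": 100.0}
        if status and all(pct >= 100.0 for pct in status.values()):
            break
        if waited >= timeout_seconds:
            # The old cluster is still serving traffic, so there is nothing to roll back.
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    switch_dns()
    scale_down_old()
    return True
```

Keeping the DNS switch as the only traffic-affecting step is what makes rollback a single operation: point DNS back at the old cluster, provided it has not been scaled down yet.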
Use Cases
● Dashboard backend
● Sub-second query times
○ User interactive slice and dice
○ Longer data retention vs Redshift
○ More dimensions vs Redshift
● Custom UI
AWS Capacity Planning
Other use cases
● Payments analysis
● Algorithms comparison
● Security
● Quality of Experience (QoE)
Future work
● Real time ingestion
○ Tranquility or Kafka indexing
● Open-source the t-digest-based histogram module
● Investigate tiering
● Revise the auto-scaling policy to take EBS into account
Scaling & Sketch Strings
How Netflix Processes 160B Daily Customer Actions to Monitor Client Performance
#netflixeverywhere
“With this launch, consumers around the world
will be able to enjoy TV shows and movies
simultaneously -- no more waiting. With the help
of the Internet, we are putting power in
consumers’ hands to watch whenever, wherever
and on whatever device.”
● 160 billion client-side data points daily
● 135+ million members
● 190 countries
● 300 million devices
● 4 major UI platforms: TVUI, Web, iOS, Android
Measure Everything Consistently
Client Performance
● Metrics
○ App launch times
○ Play delay
○ Details page time
● 29 dimensions
○ Geo
○ Network
○ Device
○ A/B test cell
Client Performance Metrics
Architecture
Show Me The Data
Summarize Instead
Anscombe’s Quartet
Saved by Sketch Strings
Box & Whisker Plots
Median application load times are similar
Country B has a larger IQR and long tail
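To make the box-and-whisker comparison concrete, the sketch below computes the numbers a box plot draws, using Python's standard `statistics` module. The load-time samples for the two countries are made up for illustration. They are chosen so that the medians match while the spread differs, mirroring the observation above.

```python
import statistics


def box_plot_summary(samples):
    """Return the quartiles and interquartile range that a box plot displays."""
    q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return {
        "min": min(samples),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(samples),
        "iqr": q3 - q1,
    }


# Hypothetical application load times in seconds.
country_a = [1.8, 1.9, 2.0, 2.0, 2.1, 2.2, 2.3]
country_b = [0.9, 1.4, 2.0, 2.0, 2.6, 3.9, 9.5]

summary_a = box_plot_summary(country_a)
summary_b = box_plot_summary(country_b)
print("A:", summary_a)
print("B:", summary_b)
```

Comparing only the medians would hide the difference. The IQR and maximum expose Country B's wider spread and long tail.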
Cumulative Distribution Functions
Recap
● Ingesting consistent and highly dimensional data
● Analyzing data via custom web visualizations
● Summarizing responsibly via sketch strings
● Druid helps us provide the best customer experience
Druid Roadmap
Roadmap and community update
Gian Merlino
gian@imply.io
Who am I?
Gian Merlino
Committer & PMC member on Apache Druid
Cofounder at Imply
>10 years working on scalable systems
Druid 0.13.0
…and beyond!!
Druid 0.13.0
400 new features and bug fixes from 81 contributors!
Druid 0.13.0
Our first Apache release!
(After years as an independent project.)
Druid 0.13.0
● Native parallel batch indexing (phase 1)
● Automatic compaction (phase 1)
● Ingestion statistics and errors via API
● SQL system tables: segments, tasks, servers
● SQL standard-compliant null handling option
● Additional aggregators (stringFirst/stringLast, new HllSketch)
● Support for multiple grouping specs in groupBy query
● Backpressure, compact result formats for large result sets
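As one way to try the SQL system tables listed above, the sketch below builds a request for Druid's SQL HTTP endpoint (`POST /druid/v2/sql`) that summarizes `sys.segments` per datasource. The broker address is an assumption, the helper names are hypothetical, and the request is only sent when `run_sql` is called against a live cluster.

```python
import json
import urllib.request

# Per-datasource segment count and total size from the sys.segments table.
SEGMENTS_BY_DATASOURCE = """
SELECT "datasource", COUNT(*) AS segment_count, SUM("size") AS total_bytes
FROM sys.segments
GROUP BY "datasource"
"""


def build_sql_request(broker_url, sql):
    """Build (but do not send) a POST request for the Druid SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=broker_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(broker_url, sql, timeout=30):
    """Send the query and return the decoded JSON rows (needs a reachable broker)."""
    request = build_sql_request(broker_url, sql)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_sql("http://localhost:8082", SEGMENTS_BY_DATASOURCE)` would return one row per datasource, assuming a broker is listening at that address.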
…and beyond!!
● Native parallel batch indexing (phase 2)
● Automatic compaction (phase 2)
● Smaller, faster compression (FastPFOR, etc.)
● Faster quantiles: Fixed-bin histograms, moments sketches
● Dynamic prioritization
● Simpler, self-configuring deployment
● … your item here!!
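The fixed-bin histogram idea from the list above can be illustrated with a toy, mergeable summary. This is a minimal sketch of the general technique, not Druid's implementation. The class name, the clamping of out-of-range values into the edge bins, and the linear interpolation within a bin are all assumptions made for brevity.

```python
class FixedBinHistogram:
    """Equal-width histogram over [lower, upper) that supports merge and quantile."""

    def __init__(self, lower, upper, num_bins):
        self.lower = lower
        self.upper = upper
        self.width = (upper - lower) / num_bins
        self.counts = [0] * num_bins

    def add(self, value):
        # Clamp outliers into the first or last bin (a simplification).
        index = int((value - self.lower) / self.width)
        index = min(max(index, 0), len(self.counts) - 1)
        self.counts[index] += 1

    def merge(self, other):
        """Combine two histograms that share the same bin layout."""
        same_layout = (self.lower, self.upper, len(self.counts)) == (
            other.lower, other.upper, len(other.counts))
        if not same_layout:
            raise ValueError("histograms must share the same bin layout")
        merged = FixedBinHistogram(self.lower, self.upper, len(self.counts))
        merged.counts = [a + b for a, b in zip(self.counts, other.counts)]
        return merged

    def quantile(self, q):
        """Estimate the q-quantile by interpolating inside the bin that holds it."""
        total = sum(self.counts)
        if total == 0:
            raise ValueError("histogram is empty")
        target = q * total
        seen = 0
        for index, count in enumerate(self.counts):
            if count and seen + count >= target:
                fraction = (target - seen) / count
                return self.lower + (index + fraction) * self.width
            seen += count
        return self.upper
```

Because merging is just adding bin counts, partial histograms built on different segments or nodes can be combined cheaply before a quantile is estimated. The trade-off is that accuracy is bounded by the bin width.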
Try this at home
Download
Druid community site (current): http://druid.io/
Druid community site (new): https://druid.apache.org/
Imply distribution: https://imply.io/get-started
Contribute
https://github.com/apache/druid
Stay in touch
@druidio
http://druid.io/community
Q&A
