Realtime Classroom Analytics
Powered By Apache Druid
Karthik Deivasigamani, Chief Architect, Noon - The Social Learning Platform
Agenda
● Who We Are
● Live Online Classroom
● Quality Of Experience
● Why Apache Druid
● Realtime Classroom Monitoring
● Key Lessons
● Q & A
Who Are We?
Noon evolved into a ‘Social Learning’
platform three years ago to craft the most
engaging learning experience.
● Our mission is to radically change the
way people learn.
● Make learning more social and fun.
● 10M+ users from over 5 countries
● 1M+ MAU with 50+ mins per active day
per student
Live Online Classroom
Students spend a significant amount of their
time on Noon learning from their teacher
within the online classrooms.
Classroom Features
● Video, Audio, Chat and Whiteboard
● Breakouts, Raise Hand
● Peak 10K students / session
Live Classroom - Challenges
Audio
Voice is broken
● Teacher’s uplink quality
● Issues with microphone
● Student’s downlink
quality
● ISP policies
Whiteboard
Lag in whiteboard
● Loss of drawing events
due to unstable network
● Heavy CPU usage on the
mobile device
● Software Bug
Quality Of Experience
“Quality of experience is a measure
of the delight or annoyance of a
customer's experiences with a
service.” - Wikipedia
Monitoring The Classroom
Metrics
● Uplink/Downlink Network Quality
● Packet Loss
● Remote/Local Audio Quality
● Mic Status
● Jitter Buffer Delay
● frameFrozenRate
● Uplink/Downlink BitRate
Dimensions
● Country
● Region
● City
● Session
● User
● ISP
● Network Type
Aggregations
● Percentile
● Count
● Average
● Distinct Count
● Standard Deviation
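The aggregations listed above can be illustrated on a handful of samples. This is a minimal sketch in plain Python, not the production pipeline: the sample rows, the field layout and the nearest-rank percentile method are all assumptions made for illustration.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical samples: (country, user_id, jitter_buffer_delay_ms).
samples = [
    ("SA", "u1", 40.0),
    ("SA", "u2", 60.0),
    ("SA", "u1", 80.0),
    ("EG", "u3", 120.0),
    ("EG", "u4", 100.0),
]

def percentile(values, p):
    """Nearest-rank percentile for 0 < p <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def summarize(rows):
    """Group rows by country and compute the five aggregations."""
    by_country = defaultdict(list)
    for country, user, value in rows:
        by_country[country].append((user, value))
    report = {}
    for country, pairs in by_country.items():
        values = [v for _, v in pairs]
        report[country] = {
            "count": len(values),
            "average": statistics.fmean(values),
            "p90": percentile(values, 90),
            "distinct_users": len({u for u, _ in pairs}),
            "stddev": statistics.pstdev(values),  # population std deviation
        }
    return report

report = summarize(samples)
```

Exact computation like this is fine for a few rows; at high cardinality the percentile and distinct count are the two aggregations that typically get replaced by approximate sketches.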
System Characteristics
● Real Time Ingestion
● Scale Horizontally
● High Cardinality Data
● Subsecond Query Latency
● Fast Aggregation
● Zoom In & Zoom Out
● Highly Available
Why Apache Druid
● Real Time Ingestion From Kafka Through Spec Files
● Data & Query Nodes Allow For Horizontal Scaling
● Sketches For High Cardinality Columns
● Low-Latency Querying
● Rich Built-In Capabilities For Exact & Approx Aggregation
● Data Rollups
● Fault Tolerance At Multiple Levels
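As a rough illustration of "ingestion from Kafka through spec files", the sketch below assembles a Kafka supervisor spec as a Python dictionary and serializes it to JSON. The helper name, datasource, topic, broker address and column names are hypothetical; the talk does not show Noon's actual spec.

```python
import json

def kafka_supervisor_spec(data_schema, topic, bootstrap_servers):
    """Wrap a dataSchema in a Kafka supervisor spec (hypothetical helper)."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": data_schema,
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": False,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }

# Placeholder schema; every name here is illustrative, not from the talk.
minimal_schema = {
    "dataSource": "classroom_qoe",
    "timestampSpec": {"column": "timestamp", "format": "iso"},
    "dimensionsSpec": {"dimensions": ["country", "session_id", "isp"]},
    "granularitySpec": {
        "segmentGranularity": "HOUR",
        "queryGranularity": "MINUTE",
        "rollup": True,
    },
}

spec = kafka_supervisor_spec(minimal_schema, "classroom-stats", "localhost:9092")
spec_json = json.dumps(spec, indent=2)
```

A document like `spec_json` is what gets submitted to the Overlord's supervisor endpoint (`POST /druid/indexer/v1/supervisor`); the sketch stops short of making that call.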
Data Collection - Network & Audio
WebRTC Stats
Sent BitRate
Received BitRate
Audio Packet Loss
Audio Level
Bytes Sent/Received
Audio Frame Freeze Rate
Network Quality
Audio Quality
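Several of these stats are usually derived from cumulative counters rather than reported directly. The sketch below shows one common way to turn two consecutive samples into a sent bitrate and a packet loss ratio. The `StatsSample` type and its fields are assumptions; the slides do not say how the client computes these values.

```python
from dataclasses import dataclass

@dataclass
class StatsSample:
    """One snapshot of cumulative counters (hypothetical shape)."""
    timestamp_ms: float
    bytes_sent: int
    packets_received: int
    packets_lost: int

def sent_bitrate_bps(prev, curr):
    """Bits per second sent between two snapshots."""
    elapsed_s = (curr.timestamp_ms - prev.timestamp_ms) / 1000.0
    if elapsed_s <= 0:
        return 0.0
    return (curr.bytes_sent - prev.bytes_sent) * 8 / elapsed_s

def packet_loss_ratio(prev, curr):
    """Fraction of expected packets that were lost in the interval."""
    lost = curr.packets_lost - prev.packets_lost
    received = curr.packets_received - prev.packets_received
    expected = lost + received
    return lost / expected if expected > 0 else 0.0

prev = StatsSample(timestamp_ms=0, bytes_sent=0, packets_received=0, packets_lost=0)
curr = StatsSample(timestamp_ms=2000, bytes_sent=50_000, packets_received=95, packets_lost=5)

bitrate = sent_bitrate_bps(prev, curr)
loss = packet_loss_ratio(prev, curr)
```

Computing deltas on the client keeps each event self-contained, so the events can be summed or averaged downstream without knowing the previous sample.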
Data Collection - Whiteboard
Whiteboard Stats
Stroke Difference
Drift Percentage
Ingestion
● All ingestions happen via Kafka in real
time
● Flink Topology
● Split & Format to conform with
ingestion spec
● Rollup Enabled At Ingestion Time
● Conditional transformation
● Looking forward to using Lag Based
AutoScaler.
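To make "rollup enabled at ingestion time" concrete, here is a minimal simulation of what rollup does: events that share a truncated timestamp and the same dimension values collapse into one stored row with pre-aggregated metrics. The event fields and the one-minute granularity are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

def floor_to_minute(ts):
    """Truncate a timestamp to the chosen query granularity (one minute)."""
    return ts.replace(second=0, microsecond=0)

def rollup(events):
    """Collapse events sharing (minute, dimensions) into one aggregated row."""
    buckets = defaultdict(lambda: {"count": 0, "packets_lost": 0})
    for e in events:
        key = (floor_to_minute(e["ts"]), e["country"], e["network_type"])
        buckets[key]["count"] += 1
        buckets[key]["packets_lost"] += e["packets_lost"]
    return dict(buckets)

def at(second):
    return datetime(2021, 1, 1, 10, 0, second, tzinfo=timezone.utc)

# Hypothetical raw events: three share a bucket, one does not.
events = [
    {"ts": at(1), "country": "SA", "network_type": "wifi", "packets_lost": 2},
    {"ts": at(20), "country": "SA", "network_type": "wifi", "packets_lost": 0},
    {"ts": at(45), "country": "SA", "network_type": "wifi", "packets_lost": 3},
    {"ts": at(50), "country": "EG", "network_type": "4g", "packets_lost": 1},
]

rolled = rollup(events)
```

Four raw events become two stored rows here. The saving shrinks as more high-cardinality dimensions (user, session) are kept in the key, which is why dimension choice and rollup go together.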
Making Ingestion Easy
● Well-defined event (ProtoBuf) schema
serialized as JSON.
● JSONPath-based DSL defining
transformers & ingestion spec.
● Parsing & Transformation based on
the configuration file in a Flink
topology.
● Ingestion Spec Auto Generated from
JSON configuration file.
● Automated Deployments Via Jenkins
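The idea of one configuration file driving both the transformation and the generated spec can be sketched as follows. This is not Noon's DSL: the config layout, the field names and the restricted `$.a.b` path syntax are assumptions, and the real transformation runs inside a Flink topology rather than in Python.

```python
# Hypothetical configuration: output column name -> simple '$.a.b' path.
CONFIG = {
    "dataSource": "classroom_audio",
    "timestamp": "$.meta.ts",
    "dimensions": {"country": "$.geo.country", "session_id": "$.session.id"},
    "metrics": {"packet_loss": "$.audio.packetLoss"},
}

def resolve(event, path):
    """Resolve a dotted '$.a.b' path; returns None if a key is missing."""
    node = event
    for key in path.split(".")[1:]:  # skip the leading '$'
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

def flatten(event, config):
    """Turn a nested event into the flat row the ingestion spec expects."""
    row = {"timestamp": resolve(event, config["timestamp"])}
    for name, path in {**config["dimensions"], **config["metrics"]}.items():
        row[name] = resolve(event, path)
    return row

def data_schema(config):
    """Derive the dataSchema part of an ingestion spec from the same config."""
    return {
        "dataSource": config["dataSource"],
        "timestampSpec": {"column": "timestamp", "format": "iso"},
        "dimensionsSpec": {"dimensions": sorted(config["dimensions"])},
        "metricsSpec": [
            {"type": "doubleSum", "name": f"{m}_sum", "fieldName": m}
            for m in sorted(config["metrics"])
        ],
    }

event = {
    "meta": {"ts": "2021-01-01T10:00:00Z"},
    "geo": {"country": "SA"},
    "session": {"id": "s-42"},
    "audio": {"packetLoss": 0.03},
}
row = flatten(event, CONFIG)
schema = data_schema(CONFIG)
```

Because the row layout and the spec come from the same file, adding a column is a one-line config change instead of coordinated edits in two places.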
Schema Design
● Always start from your use-cases.
● Identify Dimensions & Metrics
● Aggregations & Approximation (HyperLogLog,
quantiles sketches)
● Query Granularity
● Partitions
● Deep Storage
● Data Retention
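A schema that follows this checklist might look like the fragment built below. It is a sketch under assumptions: the column names are invented, and the `HLLSketchBuild` and `quantilesDoublesSketch` aggregators require the druid-datasketches extension to be loaded. The talk does not show the actual schema.

```python
import json

def schema_fragment(dimensions, distinct_fields, quantile_fields,
                    query_granularity="MINUTE", segment_granularity="HOUR"):
    """Build dimensionsSpec, metricsSpec and granularitySpec for a dataSchema."""
    metrics = [{"type": "count", "name": "event_count"}]
    for field in distinct_fields:
        # Approximate distinct counts via an HLL sketch built at ingestion.
        metrics.append(
            {"type": "HLLSketchBuild", "name": f"{field}_hll", "fieldName": field}
        )
    for field in quantile_fields:
        # Approximate percentiles via a quantiles sketch built at ingestion.
        metrics.append(
            {"type": "quantilesDoublesSketch", "name": f"{field}_quantiles",
             "fieldName": field}
        )
    return {
        "dimensionsSpec": {"dimensions": list(dimensions)},
        "metricsSpec": metrics,
        "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": segment_granularity,
            "queryGranularity": query_granularity,
            "rollup": True,
        },
    }

fragment = schema_fragment(
    dimensions=["country", "city", "isp", "network_type", "session_id"],
    distinct_fields=["user_id"],
    quantile_fields=["jitter_buffer_delay_ms"],
)
fragment_json = json.dumps(fragment, indent=2)
```

In this variant `user_id` is kept only as a sketch input and not as a dimension, which lets rollup merge rows across users. That is a trade-off: it rules out per-user drill-down on this datasource, so a per-user view would need `user_id` as a dimension or a second datasource.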
Self Serve Dashboard - Zoom Out & Zoom In
Country Level View
Sessions Inside A Country
Session Level View
Students Inside A Session View
Student Session Level View
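Each zoom level can be thought of as the same query with a different GROUP BY column and one more filter. The sketch below builds Druid SQL request payloads that way. The datasource, the column names and the helper are hypothetical; the talk does not show the dashboard's queries.

```python
# Columns the dashboard may group or filter by (hypothetical names).
DRILL_COLUMNS = ("country", "session_id", "user_id")

def drilldown_query(datasource, group_by, filters):
    """Build a Druid SQL payload for one zoom level.

    `datasource` is assumed to come from trusted configuration; column
    names are checked against an allowlist and values are bound as
    parameters rather than interpolated into the SQL text.
    """
    for column in (group_by, *filters):
        if column not in DRILL_COLUMNS:
            raise ValueError(f"unknown column: {column}")
    where = ["__time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR"]
    parameters = []
    for column, value in filters.items():
        where.append(f'"{column}" = ?')
        parameters.append({"type": "VARCHAR", "value": value})
    sql = (
        f'SELECT "{group_by}", '
        'SUM("packets_lost") * 1.0 / SUM("packets_expected") AS "loss_ratio" '
        f'FROM "{datasource}" '
        f'WHERE {" AND ".join(where)} '
        f'GROUP BY "{group_by}" '
        'ORDER BY "loss_ratio" DESC LIMIT 50'
    )
    return {"query": sql, "parameters": parameters}

# Zoom out: all countries. Zoom in: sessions in one country, then its students.
country_view = drilldown_query("classroom_qoe", "country", {})
session_view = drilldown_query("classroom_qoe", "session_id", {"country": "SA"})
student_view = drilldown_query(
    "classroom_qoe", "user_id", {"country": "SA", "session_id": "s-42"}
)
```

The ratio is computed from summed counters instead of `AVG`, because with rollup each stored row may already represent many events.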
Our Druid Cluster
Topology
● Master (m5.2xl)
● Data Node (i3.2xl)
○ Tiered
○ 24 slots
● Query Node (m5.2xl)
● External ZK, MySQL, S3 Deep Storage
Monitoring
● Datadog-Druid
● System Resources
● Ingestion Lag
● Number of Segments
● Query Time
● JVM Memory Usage
Numbers
● 15+ dims, 50+ metrics
● 105M events per day
● 2B rows @ avg row size 1K
● 4k-5k segments
● p90 latency ~850 ms
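A quick back-of-the-envelope pass over these numbers gives a feel for the scale. The only inputs are the figures on the slide; reading "1K" as roughly 1,000 bytes per row is an assumption.

```python
EVENTS_PER_DAY = 105_000_000
TOTAL_ROWS = 2_000_000_000
AVG_ROW_BYTES = 1_000          # assumption: "1K" means about 1 KB per row
SEGMENTS_LOW, SEGMENTS_HIGH = 4_000, 5_000

SECONDS_PER_DAY = 24 * 60 * 60

# Average ingest rate if events were spread evenly over the day.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY

# Approximate uncompressed data volume across all rows.
raw_size_tb = TOTAL_ROWS * AVG_ROW_BYTES / 1e12

# Average rows per segment at both ends of the stated segment range.
rows_per_segment_max = TOTAL_ROWS / SEGMENTS_LOW
rows_per_segment_min = TOTAL_ROWS / SEGMENTS_HIGH

print(f"{events_per_second:.0f} events/s on average")
print(f"{raw_size_tb:.1f} TB at the assumed row size")
print(f"{rows_per_segment_min:,.0f} to {rows_per_segment_max:,.0f} rows per segment")
```

The average rate understates the peak, since classroom traffic is concentrated in class hours.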
Putting It Together
Business Impact
● Quickly Identify Problems
● Validation of fixes put in to improve quality
● Self Serve Tool, reducing burden on
developers
● Improved transparency & trust between
OPS and developers
● Student NPS score improved
Challenges & Key Lessons
● Rollups are your best friend
● Ingestion Time Transformation > Query Time
Transformation
● Approximation - HyperLogLog, DataSketches
● Late Arrival Of Messages & Compaction
● Query Performance depends on your data model
● Setup takes time to stabilize.
● druid-user group is super helpful!
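The "late arrival and compaction" lesson can be illustrated with a toy model. With streaming ingestion, rollup is best-effort: a late event may land in a different segment from the rows it belongs with, so the same time bucket ends up stored more than once until compaction merges it. The rows and segment split below are invented for illustration.

```python
from collections import defaultdict

def rollup(rows):
    """Merge rows that share (minute, country), summing their counts."""
    merged = defaultdict(int)
    for minute, country, count in rows:
        merged[(minute, country)] += count
    return sorted((m, c, n) for (m, c), n in merged.items())

# Events that arrived on time are rolled up within their own segment.
on_time_segment = rollup([
    ("10:00", "SA", 1),
    ("10:00", "SA", 1),
    ("10:01", "SA", 1),
])

# A late event for 10:00 is written to a separate segment.
late_segment = rollup([("10:00", "SA", 1)])

# Before compaction the 10:00 bucket is stored twice.
before_compaction = on_time_segment + late_segment

# Compaction re-applies rollup across segments of the same interval.
after_compaction = rollup(before_compaction)
```

Query results are the same before and after (the counts still sum to four); what changes is the number of stored rows to scan, three before compaction and two after.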
Questions?
Thank you
Contact: karthik@noonacademy.com
