Using Splunk To Evaluate 20 Billion Ad Impressions Monthly
Isaac Mosquera, CTO
@imosquera • isaac.mosquera@getsocialize.com
A Little Bit About Real-Time Bidding



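[Diagram: an Ad Request arrives at the RTB service, which sends a Bid Request to the Socialize Bidder and receives a Bid Response. The RTB service returns the Winning Bidder's Ad, followed by the Ad Impression and the Ad Click.]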




All this needs to happen in less than 100 milliseconds!
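
A minimal sketch of how the 100 ms budget might be checked around a single bid decision. The names `BID_DEADLINE_MS`, `timed_bid`, and `decide` are hypothetical and do not come from the talk; the bidder's real timing logic is not shown in the slides.

```python
import time

BID_DEADLINE_MS = 100.0  # the budget stated on the slide


def timed_bid(decide, bid_request, deadline_ms=BID_DEADLINE_MS):
    """Run a bid decision and report how long it took.

    `decide` is any callable that maps a bid request to a bid response
    (a hypothetical stand-in for the real bidder logic).
    Returns (response, elapsed_ms, within_deadline).
    """
    start = time.perf_counter()
    response = decide(bid_request)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return response, elapsed_ms, elapsed_ms <= deadline_ms


if __name__ == "__main__":
    response, elapsed_ms, ok = timed_bid(lambda req: {"price": 1.25}, {"app": "demo"})
    print(response, round(elapsed_ms, 3), ok)
```

Logging `elapsed_ms` with every bid is one way to make slow bids searchable later.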
So what are some of our problems?
 Operational
  ●   Evaluating more than 10,000 bid requests per second
  ●   Identifying which bids take longer than 100 ms
  ●   Quickly finding any errors within the system
  ●   Problems tracking clicks and impressions mean lost
      revenue.

 Decision Making & Bid Algorithms
  ●   Merging RTB data with our Social data
  ●   Campaign spending
  ●   Campaign efficiency
  ●   Dissect data by:
       ○ apps
       ○ users
       ○ devices
Analyzing Big Data Efficiently


1.   Collection
2.   Storage
3.   Analysis/Aggregation
4.   Retrieval
Some Options
● RDBMS: SQL aggregate functions like count() present
  problems at scale.

● RDBMS: Write volume is too high for a single DB,
  which would also be a single point of failure.

● NoSQL: Would work well for high insert and query
  volumes; however, we would lose the simple
  alerting, charting and reporting dashboards.

● Hadoop: Simple querying using Hive; however, it's
  a new environment to manage, and again we would
  lose alerting, charting and reporting.
Splunk Fits the Bill
● Operational Reporting: Easily identify problems
  and prevent erroneous spending. When an alert
  goes off we hit a script which shuts off the bidder.

● Ad Hoc Queries: Allows us to find patterns in the
  data to improve our bid algorithms.

● Application Reporting: Instantly know campaign
  metrics for us and our clients.
      "This has got to be the most thorough mobile campaign report I've
      ever received, so major props to all of you." - Hipmunk Marketing

● Scalability: Adding new RTB Service providers
  means billions of new ad requests. Scaling
  horizontally is key.
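
The alert-driven shutoff mentioned under Operational Reporting could be wired up in many ways; the slides do not say how the script works. The sketch below assumes a simple flag file that the alert script writes and the bidder checks. `disable_bidding`, `bidding_enabled`, and the flag path are all hypothetical.

```python
import os
import tempfile


def disable_bidding(flag_path, reason):
    """Called by the alert script: write a flag file that turns bidding off."""
    with open(flag_path, "w") as flag:
        flag.write(reason + "\n")


def bidding_enabled(flag_path):
    """Called by the bidder before each bid: bid only while no flag exists."""
    return not os.path.exists(flag_path)


if __name__ == "__main__":
    # Hypothetical flag location, placed in a temporary directory for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        flag_path = os.path.join(tmp, "bidder.disabled")
        print(bidding_enabled(flag_path))
        disable_bidding(flag_path, "spend alert fired")
        print(bidding_enabled(flag_path))
```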
Data Collection
 ● Although Splunk works great with unstructured data, we
   need some structure to make querying easy.

 ● Created a small client to push events to Splunk indexers:




 ● Very simple: accepts only 2 fields, an event name and
   metadata (a dictionary)

 ● Events are application data like bid requests, clicks,
   impressions, and application installs
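
The client code itself was not captured in this transcript. As a rough sketch of the two-field interface described above, the example below serializes each event as one JSON line, a structure that the `spath` command can extract fields from. `EventClient`, `format_event`, and the `sink` argument are hypothetical names, and how the real client delivered events to the indexers is not shown in the source.

```python
import io
import json
import time


def format_event(name, metadata, timestamp=None):
    """Serialize one event as a single JSON line.

    Metadata keys are kept flat; the reserved keys "event" and "time"
    are set by the client and overwrite any metadata keys of the same name.
    """
    record = dict(metadata)
    record["event"] = name
    record["time"] = time.time() if timestamp is None else timestamp
    return json.dumps(record, sort_keys=True)


class EventClient:
    """Writes events to any object with a write() method (file, buffer, ...)."""

    def __init__(self, sink):
        self.sink = sink

    def send(self, name, metadata, timestamp=None):
        self.sink.write(format_event(name, metadata, timestamp) + "\n")


if __name__ == "__main__":
    buffer = io.StringIO()
    client = EventClient(buffer)
    client.send("displayed_ad", {"campaign_id": 7, "price": 2.5})
    print(buffer.getvalue(), end="")
```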
What do our logs look like?
Storage
● Performance and redundancy using new Provisioned IOPS
  for high I/O

● Nightly snapshots to S3

● Logs are gzipped by Splunk before being snapshotted for
  70% compression gains.

● Continuously indexed by Splunk so reports can even
  be done in real-time

[Diagram: the Socialize Bidder feeds two Splunk Indexers, each on its own EBS volume, with S3 Backups below them.]
Using Splunk to Analyze Operational Data
 Allows you to write MapReduce jobs with a SQL-style
 query language:
 source="nginx-prod.log"
 | stats avg(ResponseTime) as avg_rtime,
         p95(ResponseTime) as p95_rtime,
         stdev(ResponseTime) as stdev_rtime


 Easily digest information through charts
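
For readers less familiar with the `stats` functions in the query above, here is a small Python sketch of the same three summaries over a list of response times. It is an illustration only: the percentile uses the nearest-rank method, which may not match Splunk's own percentile calculation exactly, and the function names are hypothetical.

```python
import math
import statistics


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def response_time_stats(response_times):
    """Compute mean, 95th percentile, and sample standard deviation."""
    return {
        "avg_rtime": statistics.mean(response_times),
        "p95_rtime": percentile_nearest_rank(response_times, 95),
        "stdev_rtime": statistics.stdev(response_times),
    }


if __name__ == "__main__":
    samples = [12, 15, 11, 90, 14, 13, 16, 120, 12, 15]
    print(response_time_stats(samples))
```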
Analysis/Aggregation
index=ad_events displayed_ad
| spath
| bin _time span=1m
| stats count(displayed_ad) as displays
     sum(eval(price/1000)) as dollars_spent
     avg(price) as avg_cpm_price
     by campaign_id _time
| mysqloutput spec=ads-prod table=ads_analytics
  insert="campaign_id, stat_date, displays, dollars_spent, avg_cpm_price"


[Diagram: three Splunk Indexers feed a Search Head, which writes to the RDBMS (Generated Reports).]
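
To make the aggregation concrete, the sketch below reproduces the same per-minute, per-campaign rollup in plain Python. Since `price` is a CPM (cost per thousand impressions), each display costs `price / 1000` dollars. The event shape, the function name, and the mapping of `stat_date` to the start of each one-minute bucket are assumptions for illustration.

```python
from collections import defaultdict


def aggregate_displays(events, span_seconds=60):
    """Roll up displayed_ad events by (campaign_id, time bucket).

    Each event is a dict with "time" (epoch seconds), "campaign_id",
    and "price" (CPM in dollars).
    """
    buckets = defaultdict(list)
    for event in events:
        bucket_start = int(event["time"] // span_seconds) * span_seconds
        buckets[(event["campaign_id"], bucket_start)].append(event["price"])

    rows = []
    for (campaign_id, bucket_start), prices in sorted(buckets.items()):
        rows.append({
            "campaign_id": campaign_id,
            "stat_date": bucket_start,  # assumed to be the bucket start time
            "displays": len(prices),
            "dollars_spent": sum(prices) / 1000.0,
            "avg_cpm_price": sum(prices) / len(prices),
        })
    return rows


if __name__ == "__main__":
    sample = [
        {"time": 0, "campaign_id": 1, "price": 2.0},
        {"time": 30, "campaign_id": 1, "price": 4.0},
        {"time": 61, "campaign_id": 1, "price": 1.0},
    ]
    for row in aggregate_displays(sample):
        print(row)
```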
Retrieval
● MySQL and Memcache allow for very fast retrieval of
  aggregated reports

● Use aggregated information to make smarter bids

[Diagram: the Socialize Bidder reads from a Cache Cluster of three Memcache nodes, which is backed by the RDBMS.]
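
One common way to combine a cache with a database for this kind of lookup is the cache-aside pattern, sketched below. The slides do not say which pattern the bidder actually uses, so this is an assumption; a plain dict stands in for a Memcache client, and `ReportCache` and `load_from_db` are hypothetical names.

```python
class ReportCache:
    """Cache-aside lookup: try the cache first, fall back to the database."""

    def __init__(self, cache, load_from_db):
        self.cache = cache                # dict-like stand-in for Memcache
        self.load_from_db = load_from_db  # callable: campaign_id -> report or None

    def get_report(self, campaign_id):
        key = "report:%s" % campaign_id
        report = self.cache.get(key)
        if report is None:
            # Cache miss: read the aggregated report from the database
            # and store it so later lookups skip the database.
            report = self.load_from_db(campaign_id)
            if report is not None:
                self.cache[key] = report
        return report


if __name__ == "__main__":
    fake_db = {7: {"displays": 1200, "dollars_spent": 3.4}}
    reports = ReportCache({}, fake_db.get)
    print(reports.get_report(7))  # loaded from the fake database
    print(reports.get_report(7))  # served from the cache
```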
Final Architecture
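[Diagram: the Socialize Bidder connects to three Splunk Indexers and to a Cache Cluster of three Memcache nodes. The Indexers are backed up to S3 Snapshots and feed a Search Head, which writes to the RDBMS (Generated Reports) behind the Cache Cluster.]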
Thank you!
isaac.mosquera@getsocialize.com | @imosquera
