Ingesting click data for analytics
Francesco Furiani, CTO @ ClickMeter
$ whoami
Francesco Furiani (@ilfurio):
• Backend Engineer
• Roamed these halls not too long ago
Loves:
• Studying new CS stuff
• PlayStation / Bike / Traveling / Soccer
• O RLY? books
How I make a living:
• CTO @ ClickMeter
• Backend Engineer @ ClickMeter
• Enum.take_random(IT_ROLES,1) @ ClickMeter
ClickMeter
• 100k+ customers
• Getting events for customers from 10 to 3000 req/sec
Getting the data
We receive data any time someone:
• Clicks our links
• Views our pixels
• Calls our postbacks
Our customers use us:
• Inside a famous app on the day of the big release ✔
• To advertise on an extremely big video portal ✔
• On a tiny travel blog ✔
• On a physical advertising device ✔
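The three kinds of incoming events listed above (link clicks, pixel views, postback calls) could be modelled as a single record type. This is only a sketch: the field names `customer_id`, `source_id` and `timestamp` are assumptions made for illustration, not ClickMeter's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class EventKind(Enum):
    CLICK = "click"        # someone followed a tracking link
    VIEW = "view"          # someone loaded a tracking pixel
    POSTBACK = "postback"  # a server-to-server callback was received


@dataclass(frozen=True)
class Event:
    kind: EventKind
    customer_id: str   # hypothetical field
    source_id: str     # hypothetical: the link, pixel or postback that fired
    timestamp: float   # seconds since the Unix epoch


def parse_event(raw: dict) -> Event:
    """Validate a raw payload and turn it into an Event (raises on bad input)."""
    return Event(
        kind=EventKind(raw["kind"]),
        customer_id=str(raw["customer_id"]),
        source_id=str(raw["source_id"]),
        timestamp=float(raw["timestamp"]),
    )
```

Rejecting malformed payloads at the edge keeps every later stage simpler, since it only ever sees one well-defined shape.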
The challenge
We need to:
• Try not to lose the events we receive (duh)
• Show customers data that gives them better insight into their campaigns
• Scale up/down according to the incoming traffic
• Improve the product by using the data we get
• Do it as fast as possible (wasn’t this ready a week ago?)
• Do it as cheaply as possible
Size
Find the size of the problem you’re trying to solve:
• How much data do you expect? At what rate?
• What do you have to do with it?
• Do you have to do something with ALL of it?
• How fast do you have to do it?
The answers to these questions are a starting point.
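A back-of-the-envelope calculation is often enough to answer the "how much data, at what rate" question. The sketch below turns a sustained request rate into events and gigabytes per day. The 10 and 3000 req/sec figures come from the ClickMeter slide; the 1 KB event size is an assumption for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume(rate_per_sec: float, bytes_per_event: int) -> tuple[int, float]:
    """Return (events per day, gigabytes per day) for a sustained rate."""
    events = int(rate_per_sec * SECONDS_PER_DAY)
    gigabytes = events * bytes_per_event / 1e9
    return events, gigabytes


if __name__ == "__main__":
    # The event size is a made-up round number, not a measured value.
    for rate in (10, 3000):
        events, gb = daily_volume(rate, bytes_per_event=1_000)
        print(f"{rate:>5} req/sec -> {events:,} events/day, ~{gb:.1f} GB/day")
```

Under these assumptions the two extremes differ by a factor of 300, which is why the rate question has to be answered before any design work starts.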
Design
Once we know how big and bad the beast is, we need to design the ranch that will keep it in check.
It's an iterative process, prone to a lot of failures, but the world is out there to help us.
Think, write and draw a lot.
Design
… draw too much …
Which bricks should I use
Most of us will never have the joy (and the horror) of creating a new stack, novel in theory and practice.
Still, we need to understand the theory behind every brick.
Read the info, read the opinions, and try small proofs of concept of the moving parts. It helps a lot!
The cloud is a brick too
A very important brick.
Elastic compute power, the many *aaS offerings and managed solutions are a great help in terms of saved manpower and fast iterations.
It comes at a cost worth considering:
• $$$ (ymmv)
• Possible lock-in
Design with bricks
… well, it’s never definitive …
How we did it
Obviously, we didn’t follow those guidelines.
One becomes savvy after crashing and burning many times.
Still, thanks to those errors we got there and built a better infrastructure at every iteration.
How we did it
ClickMeter was already live and growing.
Its infrastructure/backend needed an overhaul.
The growth fueled the need for more power to handle more data.
Obviously, this had to be a tablecloth-trick migration.
How we did it
We were already on the cloud (AWS). We considered a hybrid approach, but it didn’t make sense.
We reviewed the old components already in production to see what to kill, keep or update.
We kept the good stuff and designed some new layers to make it work flawlessly in the new infrastructure.
Redirect engine, aka events collector
Pretty important. The redirect engines need to:
• Stay up
• Scale up/down depending on the incoming traffic
• Never lose anything
• Be as fast as possible in processing
They’re a custom web application that undergoes a lot of testing.
We used stuff like Elastic Beanstalk, Auto Scaling groups, load balancers and health routing offered by our cloud provider to manage the web app's scaling and availability.
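The "scale up/down depending on the incoming traffic" requirement boils down to simple arithmetic that a managed scaling policy approximates for you. The sketch below is not how the cloud provider's auto scaling is configured; it only illustrates the decision. The per-instance capacity and the minimum and maximum fleet sizes are hypothetical values.

```python
import math


def desired_instances(req_per_sec: float, capacity_per_instance: float,
                      min_instances: int = 2, max_instances: int = 50) -> int:
    """Instances needed to serve the traffic, clamped to [min, max]."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(req_per_sec / capacity_per_instance)
    # The floor keeps the service up at low traffic; the ceiling caps the bill.
    return max(min_instances, min(max_instances, needed))
```

For example, with an assumed capacity of 200 req/sec per instance, `desired_instances(3000, 200)` asks for 15 instances, while `desired_instances(10, 200)` stays at the floor of 2 so that losing one instance does not take the collector down.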
Tracking engine and friends
Pipeline
Most of this part uses our cloud provider's technology.
This simplifies maintenance and provisioning, keeping the focus on the value of our product.
Some moving parts are custom-made by us to interact with the cloud technology (which might be proprietary or just a repackaged well-known one).
Tracking engine and friends
[Diagram labels: Events, SQS, Kinesis, Preprocessing, Postprocessing, Pipeline, DynamoDB]
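The diagram on this slide did not survive extraction, so the exact wiring is unknown. As a rough illustration of how a queue, a stream and a key-value table can be chained, here is a minimal sketch using in-memory stand-ins. The order of the stages, the function names and the event fields are all assumptions; a real deployment would use managed services such as the SQS, Kinesis and DynamoDB named on the slide.

```python
import queue
from collections import defaultdict

# In-memory stand-ins (assumptions), not real cloud clients.
incoming = queue.Queue()   # plays the role of a message queue
stream = []                # plays the role of an ordered stream
table = defaultdict(int)   # plays the role of a key-value table


def collect(event: dict) -> None:
    """Collector: enqueue the raw event and return immediately."""
    incoming.put(event)


def preprocess() -> None:
    """Drain the queue, drop malformed events, normalise the rest onto the stream."""
    while True:
        try:
            event = incoming.get_nowait()
        except queue.Empty:
            return
        if "link" in event and "kind" in event:
            stream.append({"link": event["link"], "kind": event["kind"].lower()})


def postprocess() -> None:
    """Consume the stream and persist one counter per (link, kind)."""
    while stream:
        record = stream.pop(0)
        table[(record["link"], record["kind"])] += 1
```

Decoupling the collector from the processing stages with a queue means a slow or failing stage delays results instead of dropping events, which is the property the collector needs.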
Pipeline
A combination of real-time and batch technologies.
One of the scaling parts that actually provides value to the customers.
It computes analyses on event data, from a simple count to some predictions.
Check the data produced by your processing system to improve the pipeline step by step!
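One simple way to "check the data produced by your processing system" is to compute the same metric twice, once incrementally as events arrive and once in a batch pass over the full log, and compare the results. The sketch below does this for the simplest analysis mentioned, a per-link count. The class and function names are hypothetical.

```python
from collections import Counter


class RealTimeCounter:
    """Updates per-link click counts as each event arrives."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def on_event(self, link: str) -> None:
        self.counts[link] += 1


def batch_count(events: list) -> Counter:
    """Recomputes the same counts from the full event log in one pass."""
    return Counter(events)


def views_agree(realtime: RealTimeCounter, events: list) -> bool:
    """True when the incremental view matches the batch recomputation."""
    return realtime.counts == batch_count(events)
```

When the two views disagree, the difference points at the stage that lost, duplicated or misrouted events.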
Pipeline
Storage and data delivery
We employ different storage systems depending on delivery speed and data type.
All the data is accessible via a REST API.
This makes it relatively easy to develop a frontend layer, and lets customers take control of the data and use it in ways we may not have considered.
Operations
Managed services on the cloud help us a lot!
Most of the team can focus on improvements and shipping (users are happy, and so is the CEO).
Some of us (me) still have to be the CloudOps/DevOps.
P.S.: always prepare a Plan B for when you break things!
Cloud co$t$
The cloud is typically more expensive than your own metal.
The extra money you have to spend is actually well spent:
• Flexibility
• Easier provisioning
• Easier management
• Easier operations
There are different types of clouds, so choose wisely.
Conclusions
Creating and managing a “big data”-ready infrastructure is no easy task, but startups can do it too, step by step.
The cloud is a cool starting ground that provides many of the toys you need, so you can focus on the part of “big data” that gives you value!
Use the wisdom shared by the big/medium players that have already been there (and built most of the stuff you’re using).
Thank You
Any questions?
@il_furio
francesco@clickmeter.com

%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 

Ingesting Click Data for Analytics

  • 1. Ingesting click data for analytics Francesco Furiani, CTO @
  • 2. $ whoami Francesco Furiani (@ilfurio): • Backend Engineer • Roamed these halls not too long ago Loves: • Studying new CS stuff • PlayStation / Bike / Traveling / Soccer • O RLY? books How do I make a living: • CTO @ ClickMeter • Backend Engineer @ ClickMeter • Enum.take_random(IT_ROLES,1) @ ClickMeter
  • 3. ClickMeter
  • 4. ClickMeter: • 100k+ customers • Getting events for customers at 10 to 3000 req/sec
  • 5. Getting the data: We receive data anytime someone: • Clicks our links • Views our pixels • Calls our postbacks Our customers use us: • Inside a famous app on the day of the big release ✔ • Advertising on an extremely big video portal ✔ • A tiny travel blog ✔ • A physical device for advertising ✔
  • 6. The challenge: We need to: • Try not to lose the events we receive (duh) • Show customers data for better insight into their campaigns • Scale up/down according to the incoming traffic • Improve the product by using the data we get • Do it as fast as possible (wasn’t this ready a week ago?) • Do it as cheaply as possible
  • 7. Size: Find the size of the problem you’re trying to solve: • How much data do you expect? At what rate? • What do you have to do with it? • Do you have to do something with ALL of it? • How fast do you have to do it? Answers to these questions are a starting point.
  • 8. Design: Once we know how big and bad the beast is, we need to design the ranch that will keep it in check. It is an iterative process, prone to a lot of failures, but the world is out there to help us. Think, write and draw a lot.
  • 9. Design: … draw too much ...
  • 10. Which bricks should I use: Most of us will never have the joy (and the horror) of creating a new stack, novel in theory and practice. Still, we need to understand the theory behind every brick. Read the info, read the opinions, and try small proofs of concept of the moving parts; it helps a lot!
  • 11. The cloud is a brick too: A very important brick. Elastic computing power, many *aaS offerings and managed solutions are a great help in terms of saved manpower and fast iterations. It comes at a cost to consider: • $$$ (ymmv) • Possible lock-ins
  • 12. Design with bricks: … well it’s never definitive ...
  • 13. How we did it: Obviously we didn’t follow those guidelines. One becomes savvy after crashing and burning many times. Still, thanks to those errors we got there and built, at every iteration, a better infrastructure.
  • 14. How we did it: ClickMeter was already live and growing. It needed an overhaul of its infrastructure/backend. The growth fueled the need to be ready for more power to handle more data. Obviously this had to be a tablecloth-trick migration.
  • 15. How we did it: Already on the cloud (AWS), we considered a hybrid approach but it didn’t make sense. We reviewed the old components already in production to see what to kill, keep or update. We kept the good stuff and designed some new layers to make it work flawlessly in the new infrastructure.
  • 16. Ingesting clicks data for analytics
  • 17. Redirect engine aka events collector: Pretty important, they need to: • Stay up • Scale up/down depending on the incoming traffic • Never lose anything • Be as fast as possible in processing They’re a custom web application that undergoes a lot of testing. We used stuff like Beanstalk, scaling groups, load balancers and health routing offered by our cloud provider to manage the web app’s scaling and availability.
  • 18. Tracking engine and friends: Pipeline. Most of this part uses our cloud provider’s technology. This simplifies maintenance and provisioning, keeping the focus on the value of our product. Some moving parts are custom-made by us to interact with the cloud technology (which might be proprietary or just a repackaged known one).
  • 19. Tracking engine and friends (diagram; labels: SQS, Pipeline, Kinesis, Events, Preprocessing, Postprocessing, DynamoDB)
  • 20. Pipeline: A combination of real-time and batch technologies. One of the scaling parts that actually provides value to the customers. It computes analyses on event data, from a simple count to some predictions. Check the data produced by your processing system to improve the pipeline step by step!
  • 21. Pipeline
  • 22. Storage and data delivery: We employ different storage based on speed of delivery and data type. All the data is accessible via a REST API. This makes it relatively easy to develop a frontend layer and allows customers to take control of the data and use it in ways we may not have considered.
  • 23. Operations: Managed services on the cloud help us a lot! Most of the team can focus on improvements and shipping (users are happy, so is the CEO). Some of us (me) still have to be the CloudOp/DevOp. p.s.: always prepare a Plan B for when you break things!
  • 24. Cloud co$t$: Cloud is typically more expensive than your own metal. The extra money you have to spend is actually well spent: • Flexibility • Easier provisioning • Easier management • Easier operations There are different types of clouds, so choose wisely.
  • 25. Conclusions: Creating and managing a “big data” ready infrastructure is no easy task, but it can be done step by step, even by startups. The cloud is a cool starting ground providing you with many of the toys you need, so you can focus on what part of “big data” gives you value! Use the wisdom shared by the big/medium players that have already been there (and built most of the stuff you’re using).
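
The sizing questions from slide 7 can be turned into rough numbers. This is a minimal sketch, assuming the 10 to 3000 req/sec range from slide 4 is a sustained rate and assuming a hypothetical average event size of 1 KB (the talk does not state an event size). The function name and constants are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Hypothetical average payload per event; NOT a figure from the talk.
ASSUMED_EVENT_BYTES = 1_000


def daily_volume(req_per_sec, avg_event_bytes):
    """Return (events per day, gigabytes per day) for a sustained request rate."""
    events = int(req_per_sec * SECONDS_PER_DAY)
    gigabytes = events * avg_event_bytes / 1e9
    return events, gigabytes


if __name__ == "__main__":
    # The two ends of the traffic range quoted in the slides.
    for rate in (10, 3000):
        events, gb = daily_volume(rate, ASSUMED_EVENT_BYTES)
        print(f"{rate:>5} req/sec -> {events:,} events/day, ~{gb:.1f} GB/day")
```

Under these assumptions the gap between the quiet and the busy end of the range is a factor of 300, which is the kind of spread that makes elastic scaling worth considering.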

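The pipeline slides (events, preprocessing, a simple count) can be illustrated with a small in-memory sketch. It is not ClickMeter's implementation and does not call any AWS service: a `deque` stands in for the managed queue or stream, and the event fields `link_id` and `type` are hypothetical names chosen for the example. Malformed events are set aside rather than discarded, in the spirit of "never lose anything".

```python
from collections import Counter, deque


def preprocess(event):
    """Normalize a raw event, or return None if required fields are missing.

    The field names 'link_id' and 'type' are assumptions for this sketch.
    """
    if "link_id" not in event or "type" not in event:
        return None
    return {"link_id": str(event["link_id"]), "type": str(event["type"]).lower()}


def run_pipeline(raw_events):
    """Drain the queue, count events per (link, type), and keep rejects."""
    queue = deque(raw_events)  # stand-in for a managed queue/stream
    counts = Counter()
    rejected = []  # malformed events are kept for inspection, not dropped
    while queue:
        raw = queue.popleft()
        event = preprocess(raw)
        if event is None:
            rejected.append(raw)
        else:
            counts[(event["link_id"], event["type"])] += 1
    return counts, rejected


if __name__ == "__main__":
    sample = [
        {"link_id": 1, "type": "click"},
        {"link_id": 1, "type": "CLICK"},
        {"link_id": 2, "type": "view"},
        {"type": "click"},  # missing link_id, so it is rejected
    ]
    counts, rejected = run_pipeline(sample)
    print(dict(counts))
    print(rejected)
```

Keeping the rejected events alongside the counts also supports the advice on slide 20: inspecting what the processing step produces is how the pipeline gets improved step by step.
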
Editor's notes

  1. What we do: control marketing links and maximize conversion rates. A tool for customers to monitor, compare and optimize all their links in one place.
  2. The whole picture