@ItaiYaffe
[Diagram labels: >10B events/day; >20TB/day; S3; 1000s of nodes/day; 10s of TB ingested/day; Druid; $100Ks/month]
Awareness: exposed to campaign (e.g. via online ad)
Consideration: interest is expressed (e.g. clicked ad)
Intent: steps taken towards making a purchase (e.g. added product to cart)
Purchase
Tactic: the campaign exposure (e.g. online_ad). Stages: the steps that follow (e.g. homepage, productX_page).
Awareness -> Consideration -> Intent -> Purchase (with a drop-off between each pair of consecutive stages)
AD EXPOSURE     100M UUs
    85M drop-off
HOMEPAGE         15M UUs
    5M drop-off
PRODUCT PAGE     10M UUs
    7M drop-off
CHECKOUT          3M UUs
* UUs = Unique Users
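
The drop-off figures in the funnel above are the differences between consecutive unique-user counts. A minimal Python sketch of that arithmetic follows; the function name `drop_offs` and the list layout are illustrative assumptions, not part of the original deck.

```python
from typing import List, Tuple


def drop_offs(stages: List[Tuple[str, int]]) -> List[Tuple[str, str, int]]:
    """For each pair of consecutive stages, return (from, to, users lost)."""
    result = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        result.append((name_a, name_b, count_a - count_b))
    return result


# Unique-user counts from the funnel above, in millions.
funnel = [
    ("AD EXPOSURE", 100),
    ("HOMEPAGE", 15),
    ("PRODUCT PAGE", 10),
    ("CHECKOUT", 3),
]

for src, dst, lost in drop_offs(funnel):
    print(f"{src} -> {dst}: {lost}M drop-off")
```

Note that this only works directly on exact per-stage counts; the rest of the deck deals with obtaining those counts as distinct-user estimates.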
[Illustration: 2 users, 7 views, 2 purchases]
{event_time=2020-01-28T..., userid=uid1, attribute=online_ad}
{event_time=2020-01-28T..., userid=uid1, attribute=homepage}
{event_time=2020-01-28T..., userid=uid1, attribute=productX_page}
....
{event_time=2020-01-28T..., userid=uid1, attribute=online_ad, type=Tactic}
{event_time=2020-01-28T..., userid=uid1, attribute=homepage, type=Stage}
{event_time=2020-01-28T..., userid=uid1, attribute=productX_page, type=Stage}
....
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=homepage}
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=productX_page}
....
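
The three record shapes above (raw events, events tagged as Tactic or Stage, and one row per tactic/stage pair) can be sketched in plain Python. This is a stand-in for the Spark job, not the job itself. The lookup table, the function names, the full timestamps, and the choice to pair tactics with stages per user per day are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical lookup: which attributes are tactics and which are stages.
ATTRIBUTE_TYPES = {
    "online_ad": "Tactic",
    "homepage": "Stage",
    "productX_page": "Stage",
}


def tag_events(events):
    """Add a 'type' field (Tactic or Stage) to every raw event."""
    return [dict(e, type=ATTRIBUTE_TYPES[e["attribute"]]) for e in events]


def join_tactics_and_stages(tagged):
    """Emit one row per (date, user, tactic, stage) combination."""
    tactics, stages = defaultdict(set), defaultdict(set)
    for e in tagged:
        key = (e["event_time"][:10], e["userid"])  # (event_date, userid)
        target = tactics if e["type"] == "Tactic" else stages
        target[key].add(e["attribute"])
    rows = []
    for event_date, userid in sorted(tactics):
        for tactic in sorted(tactics[(event_date, userid)]):
            for stage in sorted(stages.get((event_date, userid), ())):
                rows.append({"event_date": event_date, "userid": userid,
                             "tactic": tactic, "stage": stage})
    return rows


# Placeholder timestamps; the deck elides the time part of event_time.
raw_events = [
    {"event_time": "2020-01-28T10:00", "userid": "uid1", "attribute": "online_ad"},
    {"event_time": "2020-01-28T10:01", "userid": "uid1", "attribute": "homepage"},
    {"event_time": "2020-01-28T10:02", "userid": "uid1", "attribute": "productX_page"},
]

for row in join_tactics_and_stages(tag_events(raw_events)):
    print(row)
```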
"type": "index_hadoop",
"spec": {
  "dataSchema": {
    "dataSource": "campaign_1472",
    "granularitySpec": {
      "queryGranularity": "day",
      "segmentGranularity": "day",
      "type": "uniform",
      "intervals": ["2020-01-01/2020-01-29"]
      ...
"timestampSpec": {
  "column": "event_date", "format": "yyyy-MM-dd"
},
"dimensionsSpec": {
  "dimensions": ["tactic", "stage"]
},
"metricsSpec": [{
  "fieldName": "userid", "type": "thetaSketch",
  "name": "user_id_sketch", "size": 65536}],
...
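
The `thetaSketch` metric above stores a compact summary of the user ids instead of the ids themselves. To give a feel for how such a summary can estimate a distinct count, here is a simplified K-Minimum-Values (KMV) sketch in Python. It is an illustration of the general idea only: Druid's `thetaSketch` is backed by the Apache DataSketches library, whose implementation differs, and the class and method names below are hypothetical. The parameter `k` plays the role of `size`; a larger `k` gives a smaller relative error, on the order of 1/sqrt(k).

```python
import hashlib
import heapq


class KMVSketch:
    """Simplified K-Minimum-Values distinct-count sketch (illustration only)."""

    def __init__(self, k: int):
        self.k = k
        self._heap = []        # negated hashes, so _heap[0] is the largest kept hash
        self._members = set()  # the hashes currently kept, for duplicate checks

    @staticmethod
    def _hash(item: str) -> float:
        """Map an item to a pseudo-uniform value in [0, 1)."""
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def update(self, item: str) -> None:
        h = self._hash(item)
        if h in self._members:
            return  # already counted
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # Replace the largest kept hash with this smaller one.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self) -> float:
        if len(self._heap) < self.k:
            return float(len(self._heap))  # fewer than k distinct items: exact
        return (self.k - 1) / -self._heap[0]


sketch = KMVSketch(k=1024)
for i in range(20000):
    sketch.update(f"user-{i}")
    sketch.update(f"user-{i}")  # duplicates do not change the estimate
print(round(sketch.estimate()))
```

The sketch keeps at most `k` values no matter how many users are ingested, which is what makes it practical to store one per rolled-up row.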
"inputSpec": {"type": "multi",
  "children": [
    {"type": "dataSource",
     "ingestionSpec": {
       "intervals": ["2020-01-01/2020-01-29"],
       "dataSource": "campaign_1472", ...}},
    {"type": "static",
     "paths": "s3://<BUCKET_NAME>/date=2020-01-28/campaign=1472",
     ...},
    ...
{__time=2020-01-28, tactic=online_ad, stage=homepage, user_id_sketch=<Object>}
{__time=2020-01-28, tactic=online_ad, stage=productX_page, user_id_sketch=<Object>}
....
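
The rolled-up rows above hold one sketch per (day, tactic, stage) combination. The grouping itself can be sketched in Python with exact sets standing in for the sketches; this is an assumption made for readability, since Druid stores an approximate theta sketch rather than the set of ids. The function name `roll_up` and the sample rows are illustrative.

```python
from collections import defaultdict


def roll_up(rows):
    """Group rows by (event_date, tactic, stage), collecting distinct user ids."""
    rolled = defaultdict(set)
    for r in rows:
        rolled[(r["event_date"], r["tactic"], r["stage"])].add(r["userid"])
    return dict(rolled)


# Hypothetical input rows in the shape produced by the join step.
rows = [
    {"event_date": "2020-01-28", "userid": "uid1", "tactic": "online_ad", "stage": "homepage"},
    {"event_date": "2020-01-28", "userid": "uid2", "tactic": "online_ad", "stage": "homepage"},
    {"event_date": "2020-01-28", "userid": "uid1", "tactic": "online_ad", "stage": "homepage"},
    {"event_date": "2020-01-28", "userid": "uid1", "tactic": "online_ad", "stage": "productX_page"},
]

for key, users in sorted(roll_up(rows).items()):
    print(key, len(users))
```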
{
  "filter": {"type": "and", "fields": [
    {"type": "or", "fields": [
      {"type": "selector", "dimension": "stage", "value": "homepage"}]},
    {"type": "or", "fields": [
      {"type": "selector", "dimension": "tactic", "value": "online_ad"}]}]},
  "intervals": ["2020-01-01T00:00:00.000/2020-01-29T23:59:59.000"],
  "granularity": "ALL",
  "dataSource": "campaign_1472",
  "aggregations": [{
    "filter": {"type": "selector", "dimension": "stage", "value": "homepage"},
    "aggregator": {"fieldName": "user_id_sketch", "size": 65536,
                   "name": "homepage_sketch", "type": "thetaSketch"},
    "type": "filtered"}],
  "queryType": "groupBy",
  "dimensions": []
}
SELECT
  APPROX_COUNT_DISTINCT_DS_THETA(user_id_sketch, 65536) AS homepage_sketch
FROM campaign_1472
WHERE "tactic" = 'online_ad'
  AND "stage" = 'homepage'
  AND __time BETWEEN '2020-01-01T00:00:00.000' AND '2020-01-29T23:59:59.000'
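
The native (JSON) form of this single-stage query is repetitive enough that it is usually generated rather than written by hand. A minimal Python sketch of such a generator follows. The helper names `selector` and `stage_count_query` are hypothetical; the JSON keys and values are the ones that appear in the native query in this deck.

```python
import json


def selector(dimension, value):
    """Build a Druid selector filter for dimension == value."""
    return {"type": "selector", "dimension": dimension, "value": value}


def stage_count_query(data_source, interval, tactic, stage, size=65536):
    """Build a groupBy query estimating unique users for one tactic and stage."""
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        "intervals": [interval],
        "granularity": "ALL",
        "dimensions": [],
        "filter": {"type": "and", "fields": [
            {"type": "or", "fields": [selector("stage", stage)]},
            {"type": "or", "fields": [selector("tactic", tactic)]},
        ]},
        "aggregations": [{
            "type": "filtered",
            "filter": selector("stage", stage),
            "aggregator": {"type": "thetaSketch", "name": f"{stage}_sketch",
                           "fieldName": "user_id_sketch", "size": size},
        }],
    }


query = stage_count_query(
    "campaign_1472",
    "2020-01-01T00:00:00.000/2020-01-29T23:59:59.000",
    tactic="online_ad",
    stage="homepage",
)
print(json.dumps(query, indent=2))
```

Only the construction of the query body is shown; how it is submitted to Druid is left out.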
ONLINE AD       8.1M UUs
HOMEPAGE        3.1K UUs
    2.5K drop-off
PRODUCT PAGE      1K UUs
...
* UUs = Unique Users
{event_time=2020-01-28T09:15, userid=uid1, attribute=productX_page}
{event_time=2020-01-28T10:10, userid=uid1, attribute=online_ad}
{event_time=2020-01-28T10:11, userid=uid1, attribute=homepage}
....
{event_time=2020-01-28T09:15, userid=uid1, attribute=productX_page, type=Stage}
{event_time=2020-01-28T10:10, userid=uid1, attribute=online_ad, type=Tactic}
{event_time=2020-01-28T10:11, userid=uid1, attribute=homepage, type=Stage}
....
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=productX_page}
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=homepage}
....
{event_date=2020-01-28, userid=uid1, tactic=online_ad, stage=homepage}
....
....
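
In the timestamped example above, the productX_page visit (09:15) precedes the online_ad exposure (10:10), and only the homepage row survives. One way to read this is that a stage counts only if it happens after the user's first exposure to the tactic on that day. That rule is an assumption inferred from the sample records, and the Python below is a plain-Python stand-in for the Spark logic, with hypothetical names.

```python
# Hypothetical lookup: attribute -> Tactic or Stage.
ATTRIBUTE_TYPES = {
    "online_ad": "Tactic",
    "homepage": "Stage",
    "productX_page": "Stage",
}


def stages_after_exposure(events):
    """Pair each tactic with the stages that occur after its first exposure."""
    first_exposure = {}  # (event_date, userid, tactic) -> earliest event_time
    for e in events:
        if ATTRIBUTE_TYPES[e["attribute"]] == "Tactic":
            key = (e["event_time"][:10], e["userid"], e["attribute"])
            if key not in first_exposure or e["event_time"] < first_exposure[key]:
                first_exposure[key] = e["event_time"]

    rows = []
    for (event_date, userid, tactic), exposed_at in sorted(first_exposure.items()):
        for e in events:
            # ISO-8601 timestamps in one format compare correctly as strings.
            if (ATTRIBUTE_TYPES[e["attribute"]] == "Stage"
                    and e["userid"] == userid
                    and e["event_time"][:10] == event_date
                    and e["event_time"] > exposed_at):
                rows.append({"event_date": event_date, "userid": userid,
                             "tactic": tactic, "stage": e["attribute"]})
    return rows


events = [
    {"event_time": "2020-01-28T09:15", "userid": "uid1", "attribute": "productX_page"},
    {"event_time": "2020-01-28T10:10", "userid": "uid1", "attribute": "online_ad"},
    {"event_time": "2020-01-28T10:11", "userid": "uid1", "attribute": "homepage"},
]

print(stages_after_exposure(events))
```

The nested loop keeps the sketch short; a real job would express the same condition as a join on user and date with a time comparison.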
{
  "filter": {"type": "and", "fields": [
    {"type": "or", "fields": [
      {"type": "selector", "dimension": "stage", "value": "homepage"},
      {"type": "selector", "dimension": "stage", "value": "productX_page"},
      {"type": "selector", "dimension": "stage", "value": "add_to_cart"},
      {"type": "selector", "dimension": "stage", "value": "checkout"}]},
    {"type": "or", "fields": [
      {"type": "selector", "dimension": "tactic", "value": "online_ad"}]}]},
  "intervals": ["2018-12-06T00:00:00.000/2020-01-29T23:59:59.000"],
  "granularity": "ALL",
  "dataSource": "campaign_974",
  "aggregations": [
    {"filter": {"type": "selector", "dimension": "stage", "value": "homepage"},
     "aggregator": {"fieldName": "user_id_sketch", "size": 65536, "name": "A", "type": "thetaSketch"}, "type": "filtered"},
    {"filter": {"type": "selector", "dimension": "tactic", "value": "online_ad"},
     "aggregator": {"fieldName": "user_id_sketch", "size": 65536, "name": "B", "type": "thetaSketch"}, "type": "filtered"},
    {"filter": {"type": "selector", "dimension": "stage", "value": "productX_page"},
     "aggregator": {"fieldName": "user_id_sketch", "size": 65536, "name": "C", "type": "thetaSketch"}, "type": "filtered"},
    {"filter": {"type": "selector", "dimension": "stage", "value": "add_to_cart"},
     "aggregator": {"fieldName": "user_id_sketch", "size": 65536, "name": "D", "type": "thetaSketch"}, "type": "filtered"},
    {"filter": {"type": "selector", "dimension": "stage", "value": "checkout"},
     "aggregator": {"fieldName": "user_id_sketch", "size": 65536, "name": "E", "type": "thetaSketch"}, "type": "filtered"}],
  "postAggregations": [{
    "field": {"func": "NOT", "size": 65536,
      "name": "(homepage AND online_ad AND ( NOT (productX_page OR add_to_cart OR checkout)))",
      "type": "thetaSketchSetOp", "fields": [
        {"func": "INTERSECT", "size": 65536,
         "name": "(homepage AND online_ad AND ( NOT (productX_page OR add_to_cart OR checkout)))",
         "type": "thetaSketchSetOp", "fields": [
           {"fieldName": "A", "type": "fieldAccess"}, {"fieldName": "B", "type": "fieldAccess"}]},
        {"func": "UNION", "size": 65536,
         "name": "(productX_page OR add_to_cart OR checkout)",
         "type": "thetaSketchSetOp", "fields": [
           {"fieldName": "C", "type": "fieldAccess"}, {"fieldName": "D", "type": "fieldAccess"},
           {"fieldName": "E", "type": "fieldAccess"}]}]},
    "name": "online_ad_596", "type": "thetaSketchEstimate"}],
  "queryType": "groupBy",
  "dimensions": []
}
-- THETA_SKETCH_ESTIMATE turns the resulting sketch into a unique-user estimate
-- (the counterpart of thetaSketchEstimate in the native query)
SELECT THETA_SKETCH_ESTIMATE(
         THETA_SKETCH_NOT(65536,
           THETA_SKETCH_INTERSECT(65536, a, b),
           THETA_SKETCH_UNION(65536, c, d, e))
       ) AS online_ad_596
FROM (
  SELECT
    DS_THETA("user_id_sketch") FILTER (WHERE stage = 'homepage') AS a,
    DS_THETA("user_id_sketch") FILTER (WHERE tactic = 'online_ad') AS b,
    DS_THETA("user_id_sketch") FILTER (WHERE stage = 'productX_page') AS c,
    DS_THETA("user_id_sketch") FILTER (WHERE stage = 'add_to_cart') AS d,
    DS_THETA("user_id_sketch") FILTER (WHERE stage = 'checkout') AS e
  FROM campaign_1472
  WHERE stage IN ('homepage', 'productX_page', 'checkout', 'add_to_cart')
    AND tactic = 'online_ad'
) subquery
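
The set expression behind this query is (homepage AND online_ad) AND NOT (productX_page OR add_to_cart OR checkout): users who reached the homepage via the ad but went no further. The same algebra on exact Python sets looks as follows. The user ids are made up for illustration, and exact sets are a stand-in for theta sketches, whose set operations return approximate results.

```python
def drop_off_users(a, b, c, d, e):
    """Users in both a and b who are in none of c, d, e."""
    return (a & b) - (c | d | e)


# Hypothetical user-id sets, one per filtered aggregation (A..E).
homepage = {"uid1", "uid2", "uid3", "uid4"}   # A: stage = homepage
online_ad = {"uid1", "uid2", "uid3", "uid9"}  # B: tactic = online_ad
product_page = {"uid2"}                       # C: stage = productX_page
add_to_cart = {"uid3"}                        # D: stage = add_to_cart
checkout = set()                              # E: stage = checkout

dropped = drop_off_users(homepage, online_ad, product_page, add_to_cart, checkout)
print(sorted(dropped), len(dropped))
```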
ONLINE AD       8.1M UUs
HOMEPAGE        3.1K UUs
    2.5K drop-off
PRODUCT PAGE    0.6K UUs
...
* UUs = Unique Users
Funnel Analysis with Spark and Druid
Funnel Analysis with Spark and Druid

Más contenido relacionado

La actualidad más candente

Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Big Data Spain
 
Petabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerPetabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerRittman Analytics
 
Refactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsLuke Han
 
30 days of google cloud event
30 days of google cloud event30 days of google cloud event
30 days of google cloud eventPreetyKhatkar
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. ServicesTigerGraph
 
The Convergence of Data Science and Software Development
The Convergence of Data Science and Software DevelopmentThe Convergence of Data Science and Software Development
The Convergence of Data Science and Software DevelopmentMargriet Groenendijk
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Databricks
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...TigerGraph
 
Dataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDeepak Chandramouli
 
Net conf uy v2018 real time analytics
Net conf uy v2018 real time analyticsNet conf uy v2018 real time analytics
Net conf uy v2018 real time analyticsGaston Cruz
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingDatabricks
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 TigerGraph
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupBlake Irvine
 
Don't build a data science team
Don't build a data science teamDon't build a data science team
Don't build a data science teamLars Albertsson
 
Viewbix tracking journey
Viewbix tracking journeyViewbix tracking journey
Viewbix tracking journeyidan_by
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...Tom Diederich
 
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...TigerGraph
 

La actualidad más candente (20)

Big query
Big queryBig query
Big query
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
 
Petabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and LookerPetabytes to Personalization - Data Analytics with Qubit and Looker
Petabytes to Personalization - Data Analytics with Qubit and Looker
 
Refactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics Products
 
30 days of google cloud event
30 days of google cloud event30 days of google cloud event
30 days of google cloud event
 
Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. Services
 
The Convergence of Data Science and Software Development
The Convergence of Data Science and Software DevelopmentThe Convergence of Data Science and Software Development
The Convergence of Data Science and Software Development
 
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
Journey to Creating a 360 View of the Customer: Implementing Big Data Strateg...
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
 
Dataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platform
 
Net conf uy v2018 real time analytics
Net conf uy v2018 real time analyticsNet conf uy v2018 real time analytics
Net conf uy v2018 real time analytics
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce Setting
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4
 
TechTuesdays Session 2
TechTuesdays Session 2TechTuesdays Session 2
TechTuesdays Session 2
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
 
Don't build a data science team
Don't build a data science teamDon't build a data science team
Don't build a data science team
 
Viewbix tracking journey
Viewbix tracking journeyViewbix tracking journey
Viewbix tracking journey
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...
 
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
Graph Gurus Episode 35: No Code Graph Analytics to Get Insights from Petabyte...
 

Similar a Funnel Analysis with Spark and Druid

DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidDevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidItai Yaffe
 
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Itai Yaffe
 
Nielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeNielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeImply
 
Ambitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics CustomisationAmbitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics CustomisationiLive Conference
 
Intro to Segment & Tracking for Live Streaming by Livestorm
 Intro to Segment & Tracking for Live Streaming by Livestorm Intro to Segment & Tracking for Live Streaming by Livestorm
Intro to Segment & Tracking for Live Streaming by LivestormLivestorm
 
EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)Anthony Lewis
 
The next level of social integration
The next level of social integrationThe next level of social integration
The next level of social integrationMat Clayton
 
Google Grants for Nonprofits
Google Grants for NonprofitsGoogle Grants for Nonprofits
Google Grants for Nonprofitsprotocol 80
 

Similar a Funnel Analysis with Spark and Druid (8)

DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and DruidDevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
DevTalks Reimagined 2020 - Funnel Analysis with Spark and Druid
 
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
Airflow Summit 2020 - Migrating airflow based spark jobs to kubernetes - the ...
 
Nielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeNielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in Practice
 
Ambitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics CustomisationAmbitious Analytics: Google Analytics Customisation
Ambitious Analytics: Google Analytics Customisation
 
Intro to Segment & Tracking for Live Streaming by Livestorm
 Intro to Segment & Tracking for Live Streaming by Livestorm Intro to Segment & Tracking for Live Streaming by Livestorm
Intro to Segment & Tracking for Live Streaming by Livestorm
 
EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)EECS 441 Company Presentation (Lewisant-Twitch)
EECS 441 Company Presentation (Lewisant-Twitch)
 
The next level of social integration
The next level of social integrationThe next level of social integration
The next level of social integration
 
Google Grants for Nonprofits
Google Grants for NonprofitsGoogle Grants for Nonprofits
Google Grants for Nonprofits
 

Más de Itai Yaffe

Mastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingMastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingItai Yaffe
 
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationSolving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationItai Yaffe
 
Lessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsLessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsItai Yaffe
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Itai Yaffe
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Itai Yaffe
 
Evaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesEvaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesItai Yaffe
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeItai Yaffe
 
Data Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsData Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsItai Yaffe
 
Unleashing the Power of your Data
Unleashing the Power of your DataUnleashing the Power of your Data
Unleashing the Power of your DataItai Yaffe
 
Data Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesData Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesItai Yaffe
 
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Itai Yaffe
 
Introducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsIntroducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsItai Yaffe
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapItai Yaffe
 
Scalable Incremental Index for Druid
Scalable Incremental Index for DruidScalable Incremental Index for Druid
Scalable Incremental Index for DruidItai Yaffe
 
The benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerThe benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerItai Yaffe
 
Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Itai Yaffe
 
Scheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureScheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureItai Yaffe
 
GraphQL API on a Serverless Environment
GraphQL API on a Serverless EnvironmentGraphQL API on a Serverless Environment
GraphQL API on a Serverless EnvironmentItai Yaffe
 
Serverless data processing built for internet SCALE
Serverless data processing built for internet SCALEServerless data processing built for internet SCALE
Serverless data processing built for internet SCALEItai Yaffe
 
Ask me anything - Women in Big Data Israel
Ask me anything - Women in Big Data IsraelAsk me anything - Women in Big Data Israel
Ask me anything - Women in Big Data IsraelItai Yaffe
 

Más de Itai Yaffe (20)

Mastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data ProcessingMastering Partitioning for High-Volume Data Processing
Mastering Partitioning for High-Volume Data Processing
 
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse AutomationSolving Data Engineers Velocity - Wix's Data Warehouse Automation
Solving Data Engineers Velocity - Wix's Data Warehouse Automation
 
Lessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark ApplicationsLessons Learnt from Running Thousands of On-demand Spark Applications
Lessons Learnt from Running Thousands of On-demand Spark Applications
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
 
Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"Planning a data solution - "By Failing to prepare, you are preparing to fail"
Planning a data solution - "By Failing to prepare, you are preparing to fail"
 
Evaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening NotesEvaluating Big Data & ML Solutions - Opening Notes
Evaluating Big Data & ML Solutions - Opening Notes
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Data Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management MonolithsData Lakes on Public Cloud: Breaking Data Management Monoliths
Data Lakes on Public Cloud: Breaking Data Management Monoliths
 
Unleashing the Power of your Data
Unleashing the Power of your DataUnleashing the Power of your Data
Unleashing the Power of your Data
 
Data Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening NotesData Lake on Public Cloud - Opening Notes
Data Lake on Public Cloud - Opening Notes
 
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
Virtual Apache Druid Meetup: AIADA (Ask Itai and David Anything)
 
Introducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom ConnectorsIntroducing Kafka Connect and Implementing Custom Connectors
Introducing Kafka Connect and Implementing Custom Connectors
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
 
Scalable Incremental Index for Druid
Scalable Incremental Index for DruidScalable Incremental Index for Druid
Scalable Incremental Index for Druid
 
The benefits of running Spark on your own Docker
The benefits of running Spark on your own DockerThe benefits of running Spark on your own Docker
The benefits of running Spark on your own Docker
 
Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?Optimizing Spark-based data pipelines - are you up for it?
Optimizing Spark-based data pipelines - are you up for it?
 
Scheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructureScheduling big data workloads on serverless infrastructure
Scheduling big data workloads on serverless infrastructure
 
GraphQL API on a Serverless Environment
GraphQL API on a Serverless EnvironmentGraphQL API on a Serverless Environment
GraphQL API on a Serverless Environment
 
Serverless data processing built for internet SCALE
Serverless data processing built for internet SCALEServerless data processing built for internet SCALE
Serverless data processing built for internet SCALE
 
Ask me anything - Women in Big Data Israel
Ask me anything - Women in Big Data IsraelAsk me anything - Women in Big Data Israel
Ask me anything - Women in Big Data Israel
 

Último

Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
Funnel Analysis with Spark and Druid