SlideShare una empresa de Scribd logo
1 de 28
Scalable Stream Processing and Map-Reduce ,[object Object],[object Object],eBay Research Labs
About eBay Research Labs ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
[object Object],The Land of Large Numbers eBay Inc. Confidential eBay users trade about $1,400 worth of goods on the site every second. ,[object Object],[object Object],[object Object],[object Object],[object Object]
Listings eBay Inc. Confidential 546.4 Million 1999 2000 2001 2002 2003 1998 2004 2005
Registered Users eBay Inc. Confidential 180.6 Million 1999 2000 2001 2002 2003 1998 2004 2005
League of Long Tail ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
Of Needles, Haystacks, and  Magnets… ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
The Log Challenge ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
What we need is.. ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
Architecture - Mobius eBay Inc. Confidential Log Processing API NFS/HDFS Web Interface  Java M/R Template Click stream  Visualizer Logs Log Query Language Application  Layer Data Source Layer Cluster Layer Extension (mapper/reducer plugin) Hadoop Hybrid Hadoop Cluster Solaris Servers Solaris Servers Solaris Servers Windows Desktop Windows Desktop Windows Desktop
eBay Inc. Confidential ?% ?% ?% ?% ?% start search exit exit view purchase View through rate Buy rate
User session information ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
What do we need in a Stream Query Language ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
Mobius Query Language (MQL)  -  Structure of a Query eBay Inc. Confidential SELECT  * FROM  source WHERE  where_condition PATTERN  pattern  WITH  condition START  start_condition END  end_condition FROM START END FILTER PATTERN SELECT Input stream
Simple Example eBay Inc. Confidential ,[object Object],[object Object],[object Object],(1,11:00AM) (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) . . . inputStream
Simple Example Contd… eBay Inc. Confidential SELECT  size(A), B FROM  inputStream WHERE  $0 > 0 PATTERN  ( ($0 < 5) +  AS  A[ ] //  $0 > 5  AS  B ) WITH  A[0].time  + 5 min > B.time  && size(A) < 3 START  $1 > 11:01AM END  $0 < 0 FROM START END FILTER PATTERN SELECT Input Stream (1,11:00AM) (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) . . . (1,11:00AM) (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) . . . (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) . . . (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) ({(2,11:02AM)}, (6,11:04AM)) ({(3,11:06AM)},(7,11:07AM) ) (1, (6,11:04AM)) (1, (7,11:07AM) ) Output: Start-end Sub-stream Filter Pattern Tuple in a bag
Other Systems ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
Pattern Query ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential Pattern Query search * view
Recommender Systems ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential
Sessionized Analysis 1 eBay Inc. Confidential X % Y % X1 % X2 % X3 % X4 % Y1 % Y2 % Y3 % Y4 % Start Algo1 Algo2 At least  one view Left eBay w/o View Left eBay w/o View At least  one view At least  one purchase no purchase  were made no purchase  were made At least  one purchase View through rate Buy rate
Step1 : Extract all sessions with “view”s ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential recommend * view Bag named session Bag named viewedRecos A
Step 2: Extract all instances of recommendations made… ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential B
Step 3: Produce a flattened view ,[object Object],[object Object],[object Object],eBay Inc. Confidential C
Step 4: Group the data by algorithm type and compute counts ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential D
Parallel Implementation ,[object Object],[object Object],[object Object],eBay Inc. Confidential From Filter Select Group Load Save Select Scatter Gather Gather Scatter select select Pattern B A D C
Understanding System Health ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential Time of the day % load 60% t
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential load<60 load > 60 load<60
Ongoing and Future Work ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],eBay Inc. Confidential

Más contenido relacionado

Similar a Hw09 Analytics And Reporting

AnDevCon - Tracking User Behavior Creatively
AnDevCon - Tracking User Behavior CreativelyAnDevCon - Tracking User Behavior Creatively
AnDevCon - Tracking User Behavior CreativelyKiana Tennyson
 
Cloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big DataCloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big DataAbhishek M Shivalingaiah
 
Adopting F# at SBTech
Adopting F# at SBTechAdopting F# at SBTech
Adopting F# at SBTechAntya Dev
 
Operational Intelligence with WSO2 BAM
Operational Intelligence with WSO2 BAM Operational Intelligence with WSO2 BAM
Operational Intelligence with WSO2 BAM WSO2
 
Why Analytics is Important for Any Business - EBriks Infotech
Why Analytics is Important for Any Business - EBriks InfotechWhy Analytics is Important for Any Business - EBriks Infotech
Why Analytics is Important for Any Business - EBriks InfotechEBriks Infotech Pvt. Ltd.
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in MotionRuhani Arora
 
Django Introduction Osscamp Delhi September 08 09 2007 Mir Nazim
Django Introduction Osscamp Delhi September 08 09 2007 Mir NazimDjango Introduction Osscamp Delhi September 08 09 2007 Mir Nazim
Django Introduction Osscamp Delhi September 08 09 2007 Mir NazimMir Nazim
 
Making the Most of Customer Data
Making the Most of Customer DataMaking the Most of Customer Data
Making the Most of Customer DataWSO2
 
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Todd Whitehead
 
Building Social Enterprise with Ruby and Salesforce
Building Social Enterprise with Ruby and SalesforceBuilding Social Enterprise with Ruby and Salesforce
Building Social Enterprise with Ruby and SalesforceRaymond Gao
 
Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015Johann de Boer
 
(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014
(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014
(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014Amazon Web Services
 
NZYP Project Casestudy using SilverStripe CMS
NZYP Project Casestudy using SilverStripe CMSNZYP Project Casestudy using SilverStripe CMS
NZYP Project Casestudy using SilverStripe CMSCam Findlay
 
Detective Controls: Gain Visibility and Record Change:
Detective Controls: Gain Visibility and Record Change: Detective Controls: Gain Visibility and Record Change:
Detective Controls: Gain Visibility and Record Change: Amazon Web Services
 
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Sease
 
[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming Apps[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming AppsWSO2
 
Why Analytics important for any business - EBriks Infotech
 Why Analytics important for any business - EBriks Infotech Why Analytics important for any business - EBriks Infotech
Why Analytics important for any business - EBriks InfotechEBriks Infotech Pvt. Ltd.
 
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...ForgeRock
 
Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21Stamatis Zampetakis
 

Similar a Hw09 Analytics And Reporting (20)

AnDevCon - Tracking User Behavior Creatively
AnDevCon - Tracking User Behavior CreativelyAnDevCon - Tracking User Behavior Creatively
AnDevCon - Tracking User Behavior Creatively
 
Cloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big DataCloudera Movies Data Science Project On Big Data
Cloudera Movies Data Science Project On Big Data
 
Adopting F# at SBTech
Adopting F# at SBTechAdopting F# at SBTech
Adopting F# at SBTech
 
Operational Intelligence with WSO2 BAM
Operational Intelligence with WSO2 BAM Operational Intelligence with WSO2 BAM
Operational Intelligence with WSO2 BAM
 
Why Analytics is Important for Any Business - EBriks Infotech
Why Analytics is Important for Any Business - EBriks InfotechWhy Analytics is Important for Any Business - EBriks Infotech
Why Analytics is Important for Any Business - EBriks Infotech
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in Motion
 
Django Introduction Osscamp Delhi September 08 09 2007 Mir Nazim
Django Introduction Osscamp Delhi September 08 09 2007 Mir NazimDjango Introduction Osscamp Delhi September 08 09 2007 Mir Nazim
Django Introduction Osscamp Delhi September 08 09 2007 Mir Nazim
 
Making the Most of Customer Data
Making the Most of Customer DataMaking the Most of Customer Data
Making the Most of Customer Data
 
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
 
Building Social Enterprise with Ruby and Salesforce
Building Social Enterprise with Ruby and SalesforceBuilding Social Enterprise with Ruby and Salesforce
Building Social Enterprise with Ruby and Salesforce
 
Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015
 
(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014
(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014
(ARC303) Panning for Gold: Analyzing Unstructured Data | AWS re:Invent 2014
 
NZYP Project Casestudy using SilverStripe CMS
NZYP Project Casestudy using SilverStripe CMSNZYP Project Casestudy using SilverStripe CMS
NZYP Project Casestudy using SilverStripe CMS
 
Appengine Nljug
Appengine NljugAppengine Nljug
Appengine Nljug
 
Detective Controls: Gain Visibility and Record Change:
Detective Controls: Gain Visibility and Record Change: Detective Controls: Gain Visibility and Record Change:
Detective Controls: Gain Visibility and Record Change:
 
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
 
[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming Apps[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming Apps
 
Why Analytics important for any business - EBriks Infotech
 Why Analytics important for any business - EBriks Infotech Why Analytics important for any business - EBriks Infotech
Why Analytics important for any business - EBriks Infotech
 
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
 
Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21
 

Más de Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Más de Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Último

How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Último (20)

How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Hw09 Analytics And Reporting

  • 1.
  • 2.
  • 3.
  • 4. Listings eBay Inc. Confidential 546.4 Million 1999 2000 2001 2002 2003 1998 2004 2005
  • 5. Registered Users eBay Inc. Confidential 180.6 Million 1999 2000 2001 2002 2003 1998 2004 2005
  • 6.
  • 7.
  • 8.
  • 9.
  • 10. Architecture - Mobius eBay Inc. Confidential Log Processing API NFS/HDFS Web Interface Java M/R Template Click stream Visualizer Logs Log Query Language Application Layer Data Source Layer Cluster Layer Extension (mapper/reducer plugin) Hadoop Hybrid Hadoop Cluster Solaris Servers Solaris Servers Solaris Servers Windows Desktop Windows Desktop Windows Desktop
  • 11. eBay Inc. Confidential ?% ?% ?% ?% ?% start search exit exit view purchase View through rate Buy rate
  • 12.
  • 13.
  • 14. Mobius Query Language (MQL) - Structure of a Query eBay Inc. Confidential SELECT * FROM source WHERE where_condition PATTERN pattern WITH condition START start_condition END end_condition FROM START END FILTER PATTERN SELECT Input stream
  • 15.
  • 16. Simple Example Contd… eBay Inc. Confidential SELECT size(A), B FROM inputStream WHERE $0 > 0 PATTERN ( ($0 < 5) + AS A[ ] // $0 > 5 AS B ) WITH A[0].time + 5 min > B.time && size(A) < 3 START $1 > 11:01AM END $0 < 0 FROM START END FILTER PATTERN SELECT Input Stream (1,11:00AM) (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) . . . (1,11:00AM) (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) . . . (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) . . . (5,11:01AM) (2,11:02AM) (6,11:04AM) (10,11:05AM) (3,11:06AM) (7,11:07AM) (2,11:08AM) (-1,11:10AM) ({(2,11:02AM)}, (6,11:04AM)) ({(3,11:06AM)},(7,11:07AM) ) (1, (6,11:04AM)) (1, (7,11:07AM) ) Output: Start-end Sub-stream Filter Pattern Tuple in a bag
  • 17.
  • 18.
  • 19.
  • 20. Sessionized Analysis 1 eBay Inc. Confidential X % Y % X1 % X2 % X3 % X4 % Y1 % Y2 % Y3 % Y4 % Start Algo1 Algo2 At least one view Left eBay w/o View Left eBay w/o View At least one view At least one purchase no purchase were made no purchase were made At least one purchase View through rate Buy rate
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.

Notas del editor

  1. Diversity collapse, dot com collapse, and how the stock survived
  2. Grew 28% sequentially and 346% annually (Q299 vs. Q298)
  3. Economics Professor at Stanford
  4. In PIG need 2 functions (1) to check a session contains a search event. (2) to find view event for a search event.
  5. Need to write down schema of the data. Explain pattern. Also explain {}
  6. GROUP is like PIG Group
  7. GROUP is like PIG Group
  8. GROUP is like PIG Group