Yahoo! Product Intelligence
                                                       MicroStrategy World 2008


           Amr A. Awadallah, PhD
           VP Product Intelligence Engineering
           Jan 2008

Yahoo! Inc – Amr A. Awadallah – MicroStrategy World 2008
About Yahoo! and its purpose.

     Yahoo! who?
     Yahoo! is the world’s largest global online network of integrated
     services and is one of the most trafficked Internet destinations
     worldwide. For more than 12 years, Yahoo! has been changing
     the way people communicate with each other, conduct
     transactions, and access/share/create information.

     Our Purpose:
     Yahoo! powers and delights our communities of users,
     advertisers, and publishers – all of us united in creating
     indispensable experiences, and fueled by trust.

Yahoo! Fast Facts
     Our Name:
     Yahoo! = Yet Another Hierarchical Officious Oracle!
     Dictionary definition = "rude, unsophisticated, uncouth"

     Employees:
     14,000 Yahoos worldwide

     Global presence:
     Operations in over 20 markets and regions around the world
     Available in over 20 languages to 477M users.

     Milestones:
     Founded in 1994
     Incorporated in 1995
     Public in 1996

     Proud to be:
     Fortune 500 Company
     Fortune ‘100 Best Company to Work For’
Yahoo! Pillars and Strategic Objectives

        Three Strategic Pillars:                                  Three Strategic Objectives:

        •    Insights will be leveraged to                        •   Online starting point for most
             deliver 10x relevant experiences                         consumers
        •    Open, allow publishers to use our                    •   Platforms that attract the most
             content/services, and vice versa                         developers and publishers
        •    Partner-of-Choice                                    •   A must buy for most advertisers



       •     We believe the focus on relevance as a measure will create a unifying
             focus to our work and drive increased value in everything we do.
       •     We are building the largest content, services, and advertising exchange.
       •     Example Strategic Partners: eBay, AT&T, Comcast, Newspaper
             Consortium, Bebo, WebMD, Cars.com, Forbes.com, Ziff Davis Media, DivX,
             Hearst.


Product Intelligence Engineering (PIE)



            Continuously generate and
          leverage insights to maximize
             sustainable value created
           through interactions within the
                Yahoo! ecosystem

Why Measure?




       • “If you can’t measure it, you can’t fix it”
       • “If you can’t measure it, you can’t grow it”
       • “If you can’t measure it, you can’t build it”




We Support More Than a Dozen Datamarts


       • Yahoo MY/Frontpage                                       • Flickr
       • Yahoo Search                                             • Yahoo Groups
       • Yahoo Mail                                               • Yahoo Answers
       • Yahoo Toolbar                                            • Yahoo News
       • Yahoo Messenger                                          • Yahoo Finance
       • Yahoo Local                                              • Yahoo Travel
       • Yahoo Video                                              • Yahoo Shopping
       • Yahoo Sports                                             • Yahoo Entertainment
Our Data Stack

                                             MicroStrategy Standard Reporting

                                            MicroStrategy Advanced Reporting

                                                 MicroStrategy Data Modeling

                                         Data Mart Database (currently Oracle)     200GB/day
                                          Data Mart ETL phase 2 (mixed tools)

                                   Data Mart ETL phase 1 (aggregations in grid)

                                       Foundational Warehouse (Link Octopus)       10TB/day
                                                            Warehouse ETL

                                                Log Collection (Data Highway)

                                       Instrumentation (Universal Link Tracking)


Groups That Use Our Systems

       • Business Operations/Finance
       • Product Management
       • Research and Development
       • User Experience and Design
       • Product Marketing
       • Advertising Sales
Example Dashboards




Case Study: A/B testing for Shopping.yahoo.com
 Is it better to show items in a list or in a grid?



                                                                     Test Bucket A: List




 Test Bucket B: Grid
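
 One way to compare the two buckets is to compute click probability
 (cookies clicked / cookies visited) per bucket and run a two-proportion
 z-test on the difference. This is only a sketch: the slides do not say
 which test was used or what the results were, the function names are
 hypothetical, and the counts in the usage lines are made up.

```python
import math


def click_probability(cookies_clicked: int, cookies_visited: int) -> float:
    """Fraction of visiting cookies that clicked at least once."""
    return cookies_clicked / cookies_visited


def two_proportion_z(clicked_a, visited_a, clicked_b, visited_b):
    """Return (z, two-sided p-value) for bucket B minus bucket A."""
    p_a = click_probability(clicked_a, visited_a)
    p_b = click_probability(clicked_b, visited_b)
    # Pooled click probability under the hypothesis of no difference.
    pooled = (clicked_a + clicked_b) / (visited_a + visited_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visited_a + 1 / visited_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: bucket A (list) versus bucket B (grid).
z, p_value = two_proportion_z(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

 A small p-value suggests the difference between list and grid is unlikely
 to be noise, but it says nothing about revenue or retention, which would
 need their own comparison.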

Dashboard Advice

   • It's very hard (impossible) to reach a single
     universal metric that summarizes how the
     product is doing
   • Proper visualization tools are very important
     since there are a lot of numbers to examine
   • Averages are nice, but histograms tell the real
     story
   • Trends are your friend: keep history, lots of it
   • Sampling is very dangerous if not used properly
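
   The point about averages and histograms can be illustrated with two
   made-up samples of minutes per cookie that share the same mean but have
   very different shapes. The numbers and the histogram helper are
   assumptions for illustration, not data from the deck.

```python
from collections import Counter
from statistics import mean


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    return Counter((v // bin_width) * bin_width for v in values)


# Hypothetical minutes-per-cookie samples with the same average.
steady = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 91]

for name, sample in (("steady", steady), ("skewed", skewed)):
    print(name, "mean =", mean(sample))
    for edge, count in sorted(histogram(sample, 10).items()):
        print(f"  {edge:>3}-{edge + 9:<3} {'#' * count}")
```

   Both samples average 10 minutes, yet in the second one nine of ten
   cookies spent about a minute and a single heavy user carries the mean.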

Example Metrics & Dimensions

          Example Metrics:
          • Click Probability (Cookies Clicked/Cookies Visited)
          • Click Yield (Clicks Per Thousand Cookies)
          • RPC (Revenue Per Cookie)
          • Sessions/Cookie/Week (or Month)
          • Time/Cookie/Week (or Month)
          • Retention Rate (percent of Cookies that returned)
          Example Dimensions:
          • Demographics (Gender, Age, Income, Tenure)
          • Geographics (Country, State, Zip, DMA)
          • Content ID
          • Access Modality (PC, PDA, Cell Phone, Net Speed, Browser, OS)
          • Traffic Source (Organic, SEM, Affiliate, Marketing Campaign)
          • Bucket Test ID
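
          A minimal sketch of how the cookie-based metrics above could be
          computed from event rows. The Event record, its field names, and
          the sample rows are hypothetical; the formulas follow the
          parenthetical definitions on the slide, and how a "visit" is
          counted is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    cookie: str           # anonymous browser cookie ID
    kind: str             # "view" or "click"
    revenue: float = 0.0  # revenue attributed to this event


def summarize(events):
    """Compute click probability, click yield, and RPC for a set of events."""
    visited = {e.cookie for e in events}
    clicked = {e.cookie for e in events if e.kind == "click"}
    clicks = sum(1 for e in events if e.kind == "click")
    revenue = sum(e.revenue for e in events)
    n = len(visited)
    return {
        "click_probability": len(clicked) / n,  # cookies clicked / visited
        "click_yield": 1000 * clicks / n,       # clicks per thousand cookies
        "rpc": revenue / n,                     # revenue per cookie
    }


def retention_rate(previous_cookies, current_cookies):
    """Percent of the previous period's cookies seen again this period."""
    previous = set(previous_cookies)
    return 100 * len(previous & set(current_cookies)) / len(previous)


events = [
    Event("a", "view"), Event("a", "click", 0.5), Event("a", "click", 0.25),
    Event("b", "view"), Event("c", "view"), Event("d", "view"),
]
print(summarize(events))
print(retention_rate({"a", "b", "c", "d"}, {"a", "b", "x"}))
```

          Slicing by a dimension such as Bucket Test ID would amount to
          grouping the events by that field before calling summarize.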
Evolution of Our BI systems

  • Started with grep ☺ and generated
    static HTML dashboards
  • Evolved to load a few aggregates into
    MySQL with a dynamic Perl dashboard
  • Today we load a lot of aggregates for
    many metrics/dimensions into Oracle
    and use MicroStrategy for reporting.
  • Next?
Next: A Unified Datamart (aka EDW)

   • We currently have 18+ separate datamart
     silos across all of Yahoo!’s products.
   • We will bring all of these datamarts under
     the same umbrella so that we can easily do
     cross-Yahoo! analytics in the datamart.
   • The data-model will need to be a hybrid
     data-model that supports horizontal
     uniformity but allows for vertically deep
     metrics and dimensions based on the
     product (e.g. mail sends is unique to mail)
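
   One possible shape for such a hybrid model, sketched as a record with
   shared (horizontal) fields plus a free-form map of product-specific
   (vertical) metrics. The DailyFact type, its field names, and the sample
   values are all hypothetical; the deck does not describe the actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DailyFact:
    # Horizontal fields shared by every product datamart.
    product: str
    day: str
    page_views: int
    clicks: int
    # Vertical, product-specific metrics (for example mail sends for Mail).
    extra: dict = field(default_factory=dict)


def cross_product_totals(facts):
    """Sum the shared metrics per day across all products."""
    totals = defaultdict(lambda: {"page_views": 0, "clicks": 0})
    for fact in facts:
        totals[fact.day]["page_views"] += fact.page_views
        totals[fact.day]["clicks"] += fact.clicks
    return dict(totals)


facts = [
    DailyFact("mail", "2008-01-15", 100, 20, {"mail_sends": 7}),
    DailyFact("search", "2008-01-15", 300, 50),
]
print(cross_product_totals(facts))
```

   Only additive measures are summed here. Cookie counts are left out on
   purpose, because the same cookie can visit several products and would be
   double counted by a plain sum.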
Next: Click and View-stream Analytics
  • Our current datamarts have aggregate data only
    which limits the number of questions that can be
    asked (we can still answer these questions from
    the LinkOctopus warehouse, but this requires an
    engineer to develop SQL due to complex schema).
  • We will expand our datamarts to include event-level
    data (both click and view-stream events). This
    will cause a large explosion in size and number
    of rows (from 200GB/day to 10TB/day).
  • The data-model will need to be a hybrid data-
    model that supports event level data but also
    aggregates (for performance and longevity)
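
  The limitation of aggregate-only datamarts can be shown with a small
  roll-up: once events are counted by one set of dimensions, a question
  about another dimension can only be answered from the event-level rows.
  The roll_up helper, the field names, and the rows are assumptions for
  illustration.

```python
from collections import Counter


def roll_up(events, dimensions):
    """Count events per combination of the named dimensions."""
    return Counter(tuple(event[d] for d in dimensions) for event in events)


# Hypothetical event-level rows; the field names are illustrative only.
events = [
    {"country": "US", "browser": "Firefox", "kind": "view"},
    {"country": "US", "browser": "IE", "kind": "click"},
    {"country": "EG", "browser": "Firefox", "kind": "view"},
]

# The kind of aggregate a datamart might store today.
by_country = roll_up(events, ["country"])

# A new question (events per browser) needs the event-level rows again,
# because by_country no longer carries the browser dimension.
by_browser = roll_up(events, ["browser"])

print(by_country)
print(by_browser)
```

  Keeping both forms, as the slide proposes, lets common reports read the
  small aggregate while ad hoc questions fall back to the events.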
Challenges we have


     • Load data before new business day starts
     • Operational stability
     • Data quality: Bot filtering, Cookie churn
     • Instrumentation Automation
     • Columnar access control
     • Scalable dimension, segmentation, and
       event-level processing

MicroStrategy: Things we like, Things we want


       Things We Like:                                         Things We Want:
       • It writes SQL for us ☺                                • Cross-mart dashboarding
       • It creates Web GUIs ☺                                 • URL functionality to send
       • We love the new flash                                   out report links
         functionality                                         • Better Portal SDK
       • We like Personalized                                  • Clickstream visualization
         Page Execution in NCS                                 • Better NCS Debugging
                                                               • Intelligent Prompts
                                                               • Better search on
                                                                 support.microstrategy.com
