Introducing SnowPlow
A new approach to web analytics

A lot is wrong with web analytics today…

Narrowly focused
• Focus on marketing-related analytics (visits, click-throughs, conversions)
• Focus on ecommerce sites (limited number of goals, limited set of clearly defined workflows, e.g. sign up to email, purchase product)
• No analytics for SaaS-based businesses, drivers of customer value, product analytics

Inflexible
• Hard to perform analyses on users / customers that span multiple visits
• Hard to examine the ways users actually engage on sites (esp. for SaaS / web apps) and to aggregate customer journeys
• Hard to map and segment users based on their behaviour and customer journeys
• Limited tools to pick out the root cause of differences in customer journeys

Too high level AND too low level
• Too high level: impractical or impossible to zoom in on individual customers and events
• Too low level: hard to see the wood for the trees in a sea of data / pre-defined views

Siloed
• Hard to integrate with other sources of customer data including CRM, email marketing, social marketing, customer service, financial systems and ad serving systems
• Typically separated from other business intelligence systems, with each system used to answer different types of business questions

…with bad consequences for businesses

Cannot answer important business questions
• Questions related to the customer base
  – Who are our most valuable customers?
  – How can I spot them in advance?
  – What are the “sliding doors” moments in a customer’s journey that impact their future value?
  – How does our customer base break down by behaviour?
  – How well do I serve each segment?
  – How well do I monetize each segment?
  – Where are the best opportunities for growing the value of my customer base?
• Product development questions
  – How successful has each product iteration been at driving user engagement?
  – Does our product work better for some customer segments than others? If so, why?
  – Does our product work better at some parts of the customer journey than others? Where?
  – Where should we focus product development efforts?

Hard to export web analytics data to answer questions in other systems
• Two reasons to export our data:
  – So that we can answer business questions using this data in another (more appropriate) system
  – So that we can use this data in other value-generating ways e.g. to drive product / content recommendation, service personalisation
• Sometimes impossible
  – Impossible to export granular data out of Google Analytics
• Otherwise expensive
  – Enterprise web analytics products charge for export based on data volumes, making export expensive for large data sets
• Hard to house exported data
  – Web analytics systems generate large volumes of data, which can be costly to warehouse and query

SnowPlow takes a radically new approach to web analytics…

Traditional approach
1. What reports do we want to deliver?
2. What data do we collect to support those reports?

SnowPlow approach
1. What is all the available data that we could ever want?
2. What tools will empower our analysts to answer any possible business question?

…one that starts from the principle of having all the data

Capture all data
• All data is captured via easy-to-implement JavaScript tags
• Lightweight event tracking makes it easy to capture any type of online behaviour
• No limits on the number, type or categories of events or variables that can be assigned
• Data is stored in Amazon S3 for scalability
• Data can be enriched from other 1st and 3rd party sources (data can be exported and imported)

Complete data ownership
• Data capture is via 1st party cookies
• JavaScript tracking and ETL source code is open source
• All data is stored in SnowPlow users’ own Amazon S3 accounts

Powerful analytics toolset
• Latest big data and cloud computing technologies for data storage and querying
• Data is queried using Apache Hive (originally developed at Facebook) via Elastic MapReduce, making it easy to run queries against enormous data sets
• Possible to run any big data analytics toolset (e.g. Mahout, Cascalog, MicroStrategy) on SnowPlow data

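
To make the "any type of event, any variables" idea concrete, here is a minimal Python sketch of an event record with free-form properties, serialised one event per line. It is an illustration only: the field names `category`, `action` and `properties` and the JSON-lines layout are assumptions, not SnowPlow's actual storage format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Event:
    """Hypothetical event record; field names are illustrative only."""
    user_id: str
    visit_id: int
    category: str
    action: str
    # Free-form variables, so new event types need no schema change.
    properties: dict = field(default_factory=dict)

    def to_line(self) -> str:
        # One JSON document per line, suitable for appending to a log file.
        return json.dumps(asdict(self), sort_keys=True)


events = [
    Event("u1", 1, "page", "view", {"url": "/pricing"}),
    Event("u1", 1, "basket", "add", {"sku": "A-100", "quantity": 2}),
    Event("u1", 2, "video", "play", {"video_id": "intro"}),
]
lines = [e.to_line() for e in events]
for line in lines:
    print(line)
```

Because each event carries its own property dictionary, a new kind of behaviour can be tracked without changing the record layout.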
To date, SnowPlow users can query data using Apache Hive, which is great for analysts but bad for business users

Hive is a data warehousing platform
• Built on top of Hadoop: scalable
• Developed at Facebook, but now widely used at e.g. Netflix, OpenX, The Globe and Mail
• Enables analysts to query data using a SQL-like language (HiveQL)

SnowPlow data is stored in a single Hive table
• Each line of data represents one event (e.g. page view, add-to-basket, video play, ad view etc.)
• Each line of data includes a user_id and visit_id

Pros
• Easy for anyone with SQL knowledge to run queries
• Straightforward to aggregate data
• Straightforward to ingest new data sources to enrich the web analytics data (e.g. CRM data, media catalogues)
• Interactive UI allows for ad hoc query development sessions
• Straightforward to export aggregated data sets into other tools
• Possible to schedule jobs to populate e.g. a KPI dashboard

Cons
• Command-line interface not suitable for many business people
• No built-in data visualisation capability (have to export data to a separate application)
• KPI dashboards can be driven from Hive analysis, but always require the integration of another application

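
The single-table layout described above (one row per event, each carrying a user_id and visit_id) lends itself to simple SQL aggregation. The sketch below uses Python's built-in SQLite as a stand-in for Hive so that it runs anywhere. The table name `events`, the `event` column and the sample rows are assumptions for illustration; only `user_id` and `visit_id` come from the slides.

```python
import sqlite3

# In-memory SQLite database standing in for the Hive events table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        user_id  TEXT,
        visit_id INTEGER,
        event    TEXT   -- hypothetical column naming the event type
    )
    """
)

# Made-up sample data: one row per event.
rows = [
    ("u1", 1, "page_view"),
    ("u1", 1, "add_to_basket"),
    ("u1", 2, "page_view"),
    ("u2", 1, "page_view"),
    ("u2", 1, "video_play"),
    ("u3", 1, "page_view"),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)

# Visits and events per user, spanning all of that user's visits.
QUERY = """
    SELECT user_id,
           COUNT(DISTINCT visit_id) AS visits,
           COUNT(*)                 AS events
    FROM events
    GROUP BY user_id
    ORDER BY user_id
"""
summary = conn.execute(QUERY).fetchall()
for user_id, visits, events in summary:
    print(user_id, visits, events)
```

A query of this shape should carry over to HiveQL largely unchanged, although that has not been verified against a Hive installation here.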
Our priority now is to develop the toolset to answer business questions using all this analytics data

SnowPlow web analytics data

KPIs and standard reports
• Enable analysts to easily create and distribute KPI dashboards and reports, including on customer lifetime value and cohort analysis
• Reports will vary in scope e.g. for management team, marketing teams, product development team etc.

Ad hoc analytics
• Enable analysts with more limited SQL and programming knowledge to query data e.g. pivot tables, data visualisation tools
• Statistical and machine learning tools to perform e.g. behavioural segmentations of customer base, predict likely customer lifetime value

Operational systems e.g. recommendation engines, marketing
• Use SnowPlow data in live systems e.g. in-store product recommendation…
• …or to send personalised marketing to customers to drive up customer satisfaction

Some of the analytics tools we develop will be offered as cloud-based solutions, for a monthly subscription

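
As an illustration of the cohort analysis mentioned above, the following Python sketch groups users by the month of their first visit and measures what share of each cohort is active in later months. It is a minimal sketch under stated assumptions: the `(user_id, date)` event tuples and the sample data are hypothetical, and a real analysis would read these rows from the warehoused event data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event rows: (user_id, date on which the event occurred).
events = [
    ("u1", date(2012, 1, 5)),
    ("u1", date(2012, 2, 9)),
    ("u2", date(2012, 1, 20)),
    ("u3", date(2012, 2, 2)),
    ("u3", date(2012, 3, 15)),
    ("u1", date(2012, 3, 1)),
]


def month_index(d: date) -> int:
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month


# A user's cohort is the month of their first recorded event.
first_seen = {}
for user_id, day in events:
    m = month_index(day)
    first_seen[user_id] = min(first_seen.get(user_id, m), m)

# cohort month -> months since first visit -> set of active users
active = defaultdict(lambda: defaultdict(set))
for user_id, day in events:
    cohort = first_seen[user_id]
    active[cohort][month_index(day) - cohort].add(user_id)


def retention(cohort_month: int) -> dict:
    """Share of the cohort active at each month offset (offset 0 = first month)."""
    by_offset = active.get(cohort_month, {})
    size = len(by_offset.get(0, ()))
    if size == 0:
        return {}
    return {off: len(users) / size for off, users in sorted(by_offset.items())}


jan = retention(month_index(date(2012, 1, 1)))
feb = retention(month_index(date(2012, 2, 1)))
print(jan)
print(feb)
```

Because every event row carries a user identifier, this kind of analysis spans multiple visits without any extra joining.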
Whilst many of the tools are not yet developed, we recommend installing SnowPlow today

1. Start warehousing your web analytics data using SnowPlow today
2. Start using the already available (free, open source) tools, particularly Apache Hive, to drive insight from your user data today
3. Have a large data set ready for when our more business-friendly analytics tools become available

Download SnowPlow from GitHub: github.com/snowplow/snowplow
Contact Keplar LLP for support and consultancy: www.keplarllp.com
