Choosing a Big Data Technology Stack for Digital Marketing




Gary Angel, President and CTO
Krishnan Parasuraman, CTO, IBM Big Data Solutions

Your Hosts

    Gary Angel, Semphonic President and Co-Founder
    20+ years experience with BI & database marketing
    15 years experience with digital measurement
    Leading industry expert, speaker, blogger and Semphonic practice
     leader for advanced analytics
    Selected: Digital Analytics Association (formerly WAA) Most Influential
     Industry Contributor: 2012




    Krishnan Parasuraman, CTO Big Data Solutions, IBM

    15+ years experience with Large scale distributed information systems

    Background in product development, consulting and technology management

    Leading authority on big data technologies such as massively parallel data
     warehousing and Hadoop

    Author of the book Harness the Power of Big Data
Talking Points for today’s discussion

•  Challenges with Digital Marketing and Analytics. What makes this problem so unique and different?

•  Why is it hard to use traditional database technologies to analyze Digital Data?

•  What type of framework would we use to evaluate and select the right technology stack for Digital Analytics needs?

•  How does IBM’s stack address the needs of Digital Marketing?

Introducing Semphonic

    Founded in 1997 and exclusively focused on digital measurement and digital customer analytics

    Deep expertise in traditional Web analytics solutions (Omniture, GA Premium, IBM, etc.) AND in the
     use of advanced technologies for warehousing, integrating, and analyzing digital data.


    Practice focused on high-end customer analytics including:
           Digital Segmentation
           Site optimization and Personalization
           Customer Analytics
           Attribution Analysis & Media Mix Modeling
           Digital Data Models for the Warehouse
Two Worlds Divided

[Diagram: two separate groups, "BI & Customer Analytics" and "Web Analytics and Digital"]

•  Traditional BI and Customer Analytics teams have deep methods and powerful tools. But digital data is surprisingly different and challenging.
•  Digital Measurement professionals lack the tools, the methodology, and the expertise to do multi-channel analytics.

Our Goal is to Bring Them Together

[Diagram: "Database Marketing" and "Web Analytics" joined together, surrounded by the terms Statistical Models, Proven, Actionable, Online Behavior, Demographics, Email Marketing, Database-Driven, Event Driven, Social, Old, SaaS, List Enhancement, Customer Driven]

Digital Analytics in a Big-Data World

Digital Analytics is a Paradigm Big Data Application

Digital Measurement is a paradigm case of big data:

•  Lots of data
      –  Millions (hundreds of millions?) of events per day
      –  Lots of data per event

•  Lots of key High Cardinality variables
      –  Page Name, Product Sets, Referrers, Campaigns, Keywords
      –  and Customers

•  Focus on Detail-Level Analytics:
      –  Customer Lifetime Value
      –  Full multi-touch attribution

•  Lack of meaning at the Row-Level
      –  In digital, meaning exists in a collection of records.

Why it’s Challenging

These unique aspects of digital data make it difficult for most traditional technology stacks to support effective digital measurement and analysis:

•  Large Row Volumes
      –  Defeats systems not set up to optimize full table scans
      –  Often creates unmanageable indexing sizes
      –  Creates basic load and availability issues

•  High Cardinality
      –  Defeats many classic OLAP strategies
      –  Forces full-table or index scans of the data

•  Focus on Detail Level Analytics
      –  Defeats aggregation strategies
      –  No opportunity for fixed aggregates to succeed

•  Lack of Meaning at the Row Level
      –  Defeats simple aggregation strategies
      –  Defeats traditional row-based ETL

Digital Data: Meaning & Integration

Why Digital Data IS DIFFERENT

•  Here’s why your traditional BI and Customer Analytics folks struggle with Digital:
       –  There are no domain experts
       –  Nearly all digital data is stream data
       –  Unlike transaction data, digital streams don’t aggregate cleanly
       –  Digital Data often contains a hidden topographic structure

[Diagram: four boxes]
       –  Traditional data modeling relied on domain experts. These don’t exist in digital.
       –  Most modeling and analysis systems provide row-based analysis. Analytics data is stream data.
       –  Unlike Transaction data, digital stream data doesn’t aggregate.
       –  Hidden Structure skews Basic Statistical Analysis.

Talking Streams (More detail because this is hard to convey)

Aggregation of Streams

•  Aggregation of streams is critical to effective digital measurement
     –  This isn’t because of performance (though it helps)
     –  Digital data has meaning as a collection not a single row
     –  So no matter how powerful your processing system, you need to understand whole sequences of behavior

•  Traditional aggregation doesn’t work:

[Diagram: Transaction → Total Transactions; Page View → Total Page Views]

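The contrast can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical event layout of (visitor_id, timestamp, page) and a toy labeling rule; none of these names come from the presentation. Counting page views is easy, but the meaning sits in the ordered sequence per visitor.

```python
from collections import defaultdict

# Hypothetical page-view rows: (visitor_id, timestamp, page).
page_views = [
    ("v1", 1, "home"),
    ("v2", 1, "home"),
    ("v1", 2, "product"),
    ("v2", 2, "support"),
    ("v1", 3, "checkout"),
    ("v2", 3, "support"),
]

# Traditional aggregation: a single total that says little about behavior.
total_page_views = len(page_views)

def build_streams(rows):
    """Group rows by visitor and order them by time into one stream each."""
    streams = defaultdict(list)
    for visitor_id, _ts, page in sorted(rows, key=lambda r: (r[0], r[1])):
        streams[visitor_id].append(page)
    return dict(streams)

def classify(stream):
    """Toy stream-level label: the sequence, not any single row, carries it."""
    if "checkout" in stream:
        return "buyer"
    if stream.count("support") >= 2:
        return "support seeker"
    return "browser"

streams = build_streams(page_views)
labels = {visitor: classify(pages) for visitor, pages in streams.items()}

print(total_page_views)
print(streams)
print(labels)
```

Both visitors contribute three page views to the total, yet their streams describe very different visits.
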
Why Streams Matter

•  The single biggest driver of digital analytics measurement is the need to de-silo data.
•  Proper answers to ALL of these questions require multi-channel data integration.
      –  Where do mobile apps fit in the broader customer journey?
      –  How does web engagement translate into offline sales?
      –  How do my best offline customers use the digital channel?
      –  What’s the Predicted Lifetime Value of a Digital Lead?
      –  What’s the value of a Facebook Fan?
      –  What impact does Positive Social Chatter have on Brand Affinity?
      –  What content on my Website is most effective?

A Quick Primer on Joins

Joining one type of Customer Record to another yields a single row per customer with an easy to use combined record.

Joining a Customer Record to Geo or Census data yields a single row per customer with an easy to use record.

And Why Streams are a Pain

Combining streams like digital and mobile, even with a join key, just yields two distinct streams. The join doesn’t simplify analysis.

Putting multiple digital data sources on the same box WITH join keys doesn’t really solve the problem.

Every Sub-Channel has Distinct Streams

•  One of the HUGE challenges facing a digital technology stack is that almost every digital source is quite different.
      –  One of the most common failure points we see is the assumption that putting the data in one place makes it useful.
      –  Given the challenges of stream analytics in a single sub-channel, asking the analyst to join streams on unmodified data is overly optimistic.
      –  An effective data model has to provide a means of unifying sub-channels in a coherent structure.

Choosing the Right Digital Technology Stack

Don’t Have a Homer Moment (D’oh)

•  Creating a strong foundation for assessment begins with your business purposes. Each of these puts different stresses on the underlying technology stack:
      –  Advanced Web Analytics
      –  Customer Modeling
      –  Personalization
      –  Email Targeting
      –  Site Personalization
      –  Loyalty Program Analytics
      –  Merchandising Analytics
      –  Enterprise Dashboarding
      –  Social Media Analytics
      –  Operations (Call Avoidance, etc.)

Here are the Key Decision Vectors

•  We’ve matched the business functions to the following key attributes of various big data technology stacks:

The goal is to help you assess what technology trade-offs best fit your needs.

Decision Vectors

[Radar charts: "Advanced Web Analytics" and "Advanced Web Analytics & Hadoop" (legend: Advanced Web Analytics, Hadoop), each scored from 0 to 90 on these axes: Handling Huge Volume; Minimize Data Modeling; Easy Data Integration; Support Integrated Marketing Solutions; Support BI Tools; Support Stats Tools; Expertise Available; Support Algorithmic Queries; Real-time Support; Minimize Administration; Uptime/Load Without Disruption]

And Here’s a Snapshot of the Decision Matrix

Most Common Failure Points

Here are some common risk points:

•  Data Windows
      –  Insisting on Too Much History
      –  Using a single technology

•  ETL
      –  Keeping too much data
      –  Missing Join Keys

•  Integration
      –  Failure to Reckon with Streams
      –  Assumption that a key is all that’s necessary

•  Analytics
      –  Ad Hoc Effort instead of up-front segmentation
      –  Failure to understand Topology

•  Data Democratization
      –  Lack of structure
      –  Tool Complexity

•  Real-time
      –  Single Technology Stack
      –  Unrealistic expectations

Digital Analytics is a Paradigm Big Data Application

Digital Measurement is a paradigm case of big data:

•  Lots of data
      –  Millions (hundreds of millions?) of events per day
      –  Lots of data per event

•  Lots of key High Cardinality variables
      –  Page Name, Product Sets, Referrers, Campaigns, Keywords
      –  and Customers

•  Focus on Detail-Level Analytics:
      –  Customer Lifetime Value
      –  Full multi-touch attribution

•  Lack of meaning at the Row-Level
      –  In digital, meaning exists in a collection of records.

[Diagram: these characteristics point to a BIG DATA PLATFORM]

The Big Data Platform Requirements

[Diagram: data sources feeding the BIG DATA PLATFORM]
•  Online: Impressions, Cookies, Registrations, Purchase Transactions, In-Market Intent
•  Social: Influence, Sentiments, Followers, Recommendations, Likes
•  3rd Party: Psychographic surveys, Geo-Demographic, Segments, Offline Transactions, Responses

Requirements:

•  Analyze Extreme Volumes of Data
      –  Online, Offline, Social, Behavior, First Party & Third Party across multiple channels

•  Analyze Wide Variety of Data
      –  Structured: POS, 3rd Party, Transactions
      –  Unstructured: Social, Video, Blogs
      –  Semi-Structured: Cookies, Impressions

•  Analyze Data in Real Time
      –  Product Recommendations, Real Time offers, Targeted Ads in Real Time

•  Discover & Experiment
      –  Ad-hoc analytics, data discovery & experimentation

•  Governance
      –  Enforce data structure, integrity and control to ensure consistency

IBM’s Big Data Platform

[Diagram: data sources feeding the BIG DATA PLATFORM]
•  Online: Impressions, Cookies, Registrations, Purchase Transactions, In-Market Intent
•  Social: Influence, Sentiments, Followers, Recommendations, Likes
•  3rd Party: Psychographic surveys, Geo-Demographic, Segments, Offline Transactions, Responses

Platform components:

•  Netezza
      –  Extreme Performance
      –  In-Database Analytics
      –  Scalable Appliance

•  Streams
      –  Act on Data “In-Motion”
      –  Real time analytics
      –  Alerts/Actions

•  Big Insights
      –  Hadoop/Unstructured Data
      –  Complex Analytics

Audience Q&A

Download the Full Whitepaper at:
http://www.semphonic.com/a-big-data-technology-stack.html

Learn more about IBM’s big data solutions at:
http://www.ibmbigdatahub.com
http://www.analyzingmedia.com

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Último (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

Choosing a Big Data Technology Stack for Digital Marketing

  • 1. Choosing a Big Data Technology Stack for Digital Marketing
    Gary Angel, President and CTO
    Krishnan Parasuraman, CTO, IBM Big Data Solutions
  • 2. Your Hosts
    Gary Angel, Semphonic President and Co-Founder
      – 20+ years of experience with BI & database marketing
      – 15 years of experience with digital measurement
      – Leading industry expert, speaker, blogger and Semphonic practice leader for advanced analytics
      – Selected: Digital Analytics Association (formerly WAA) Most Influential Industry Contributor: 2012
    Krishnan Parasuraman, CTO Big Data Solutions, IBM
      – 15+ years of experience with large-scale distributed information systems
      – Background in product development, consulting and technology management
      – Leading authority on big data technologies such as massively parallel data warehousing and Hadoop
      – Author of the book Harness the Power of Big Data
  • 3. Talking Points for today’s discussion
    • Challenges with Digital Marketing and Analytics. What makes this problem so unique and different?
    • Why is it hard to use traditional database technologies to analyze Digital Data?
    • What type of framework would we use to evaluate and select the right technology stack for Digital Analytics needs?
    • How does IBM’s stack address the needs of Digital Marketing?
  • 4. Introducing Semphonic
    • Founded in 1997 and exclusively focused on digital measurement and digital customer analytics
    • Deep expertise in traditional Web analytics solutions (Omniture, GA Premium, IBM, etc.) AND in the use of advanced technologies for warehousing, integrating, and analyzing digital data
    • Practice focused on high-end customer analytics including:
      – Digital Segmentation
      – Site Optimization and Personalization
      – Customer Analytics
      – Attribution Analysis & Media Mix Modeling
      – Digital Data Models for the Warehouse
  • 5. Two Worlds Divided
    BI & Customer Analytics | Web Analytics and Digital
    • Traditional BI and Customer Analytics teams have deep methods and powerful tools. But digital data is surprisingly different and challenging.
    • Digital Measurement professionals lack the tools, the methodology, and the expertise to do multi-channel analytics.
  • 6. Our Goal is to Bring Them Together
    [Diagram: Database Marketing and Web Analytics brought together. Labels: Statistical Models, Proven, Actionable, Online Behavior, Demographics, Email Marketing, Database-Driven, Event Driven, Social, Old, SaaS, List Enhancement, Customer Driven]
  • 7. Digital Analytics in a Big-Data World
  • 8. Digital Analytics is a Paradigm Big Data Application
    Digital Measurement is a paradigm case of big data:
    • Lots of data
      – Millions (hundreds of millions?) of events per day
      – Lots of data per event
    • Lots of key High Cardinality variables
      – Page Name, Product Sets, Referrers, Campaigns, Keywords
      – and Customers
    • Focus on Detail-Level Analytics:
      – Customer Lifetime Value
      – Full multi-touch attribution
    • Lack of meaning at the Row Level
      – In digital, meaning exists in a collection of records.
  • 9. Why it’s Challenging
    These unique aspects of digital data make it difficult for most traditional technology stacks to support effective digital measurement and analysis:
    • Large Row Volumes
      – Defeats systems not set up to optimize full table scans
      – Often creates unmanageable indexing sizes
      – Creates basic load and availability issues
    • High Cardinality
      – Defeats many classic OLAP strategies
      – Forces full-table or index scans of the data
    • Focus on Detail-Level Analytics
      – Defeats aggregation strategies
      – No opportunity for fixed aggregates to succeed
    • Lack of Meaning at the Row Level
      – Defeats simple aggregation strategies
      – Defeats traditional row-based ETL
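A rough way to see why high cardinality defeats fixed aggregates is to count the cells a fully pre-aggregated cube would need. The sketch below is only an illustration: the dimension names echo the slide, but the cardinality figures and the `cube_cells` helper are hypothetical and do not come from the presentation.

```python
from math import prod

# Hypothetical distinct-value counts per dimension (illustrative only,
# not figures from the presentation).
cardinalities = {
    "page_name": 50_000,
    "campaign": 2_000,
    "keyword": 200_000,
    "referrer": 100_000,
}


def cube_cells(dims):
    """Upper bound on the cells of a fully pre-aggregated cube over dims."""
    return prod(cardinalities[d] for d in dims)


two_dims = cube_cells(["page_name", "campaign"])
all_dims = cube_cells(list(cardinalities))

print(f"page x campaign: {two_dims:,} possible cells")
print(f"all four dimensions: {all_dims:,} possible cells")
```

Even before customers are added as a dimension, the upper bound grows multiplicatively, which is one reason detail-level scans tend to replace fixed aggregates for this kind of data.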
  • 10. Digital Data: Meaning & Integration
  • 11. Why Digital Data IS DIFFERENT
    • Here’s why your traditional BI and Customer Analytics folks struggle with Digital:
      – There are no domain experts
      – Nearly all digital data is stream data
      – Unlike transaction data, digital streams don’t aggregate cleanly
      – Digital data often contains a hidden topographic structure
    • Callouts:
      – Most modeling and analysis systems provide row-based analysis. Analytics data is stream data.
      – Traditional data modeling relied on domain experts. These don’t exist in digital.
      – Unlike transaction data, digital stream data doesn’t aggregate.
      – Hidden structure skews basic statistical analysis.
  • 12. Talking Streams (More detail because this is hard to convey)
  • 13. Aggregation of Streams
    • Aggregation of streams is critical to effective digital measurement
      – This isn’t because of performance (though it helps)
      – Digital data has meaning as a collection, not a single row
      – So no matter how powerful your processing system, you need to understand whole sequences of behavior
    • Traditional aggregation doesn’t work:
      [Diagram: Transaction → Total Transactions; Page View → Total Page Views]
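The contrast on this slide can be sketched in a few lines: a transaction-style total discards the behavior, while grouping events into ordered per-visitor sequences keeps it. This is a minimal sketch under stated assumptions; the event tuples, field order, and the `visitor_paths` helper are hypothetical and not taken from the presentation.

```python
from collections import defaultdict

# Hypothetical page-view events: (visitor_id, timestamp, page).
events = [
    ("v1", 3, "checkout"),
    ("v2", 1, "home"),
    ("v1", 1, "home"),
    ("v2", 2, "support"),
    ("v1", 2, "product"),
    ("v2", 3, "support"),
]

# Transaction-style aggregate: a single total, with the behavior gone.
total_page_views = len(events)


def visitor_paths(rows):
    """Group events by visitor and order each visitor's pages in time."""
    by_visitor = defaultdict(list)
    for visitor, ts, page in rows:
        by_visitor[visitor].append((ts, page))
    return {v: [page for _, page in sorted(seq)] for v, seq in by_visitor.items()}


paths = visitor_paths(events)
print(total_page_views)
print(paths["v1"])
print(paths["v2"])
```

Both visitors contribute three page views to the total, yet one path ends in a checkout and the other loops through support pages; that difference only exists at the sequence level.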
  • 14. Why Streams Matter
    • The single biggest driver of digital analytics measurement is the need to de-silo data.
    • Proper answers to ALL of these questions require multi-channel data integration.
      – Where do mobile apps fit in the broader customer journey?
      – How does web engagement translate into offline sales?
      – How do my best offline customers use the digital channel?
      – What’s the Predicted Lifetime Value of a Digital Lead?
      – What’s the value of a Facebook Fan?
      – What impact does Positive Social Chatter have on Brand Affinity?
      – What content on my Website is most effective?
  • 15. A Quick Primer on Joins
    • Joining one type of Customer Record to another yields a single row per customer with an easy-to-use combined record.
    • Joining a Customer Record to Geo or Census data yields a single row per customer with an easy-to-use record.
  • 16. And Why Streams Are a Pain
    • Combining streams like digital and mobile, even with a join key, just yields two distinct streams. The join doesn’t simplify analysis.
    • Putting multiple digital data sources on the same box WITH join keys doesn’t really solve the problem.
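The difference between the two kinds of join can be illustrated with plain Python dictionaries and lists. Everything here is a hypothetical sketch: the customer records, the geo lookup, and the web and mobile events are invented for illustration and are not data from the presentation.

```python
# Hypothetical records; all names and values are illustrative.
customers = {
    "c1": {"name": "Ann", "zip": "94105"},
    "c2": {"name": "Bo", "zip": "10001"},
}
geo = {"94105": {"region": "West"}, "10001": {"region": "East"}}

# Record-to-record join: still exactly one row per customer.
joined = {cid: {**rec, **geo[rec["zip"]]} for cid, rec in customers.items()}

# Two event streams for the same customer: (customer_id, timestamp, channel, event).
web = [("c1", 1, "web", "home"), ("c1", 4, "web", "product")]
mobile = [
    ("c1", 2, "mobile", "app_open"),
    ("c1", 3, "mobile", "search"),
    ("c1", 5, "mobile", "cart"),
]

# Stream-to-stream join on the customer key: every web event pairs with
# every mobile event, so the row count multiplies instead of collapsing.
pairs = [(w, m) for w in web for m in mobile if w[0] == m[0]]

# Interleaving by time keeps one timeline, but it is still a sequence of
# channel-specific events that the analyst has to interpret.
timeline = sorted(web + mobile, key=lambda e: e[1])

print(len(joined), len(pairs))
print([channel for _, _, channel, _ in timeline])
```

The key alone does not produce a combined, analysis-ready record; some modeling step still has to decide what the merged sequence means.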
  • 17. Every Sub-Channel has Distinct Streams
    • One of the HUGE challenges facing a digital technology stack is that almost every digital source is quite different.
      – One of the most common failure points we see is the assumption that putting the data in one place makes it useful.
      – Given the challenges of stream analytics in a single sub-channel, asking the analyst to join streams on unmodified data is overly optimistic.
      – An effective data model has to provide a means of unifying sub-channels in a coherent structure.
  • 18. Choosing the Right Digital Technology Stack
  • 19. Don’t Have a Homer Moment (D’oh)
    • Creating a strong foundation for assessment begins with your business purposes. Each of these puts different stresses on the underlying technology stack:
      – Advanced Web Analytics
      – Customer Modeling
      – Personalization
      – Email Targeting
      – Site Personalization
      – Loyalty Program Analytics
      – Merchandising Analytics
      – Enterprise Dashboarding
      – Social Media Analytics
      – Operations (Call Avoidance, etc.)
  • 20. Here are the Key Decision Vectors
    • We’ve matched the business functions to the following key attributes of various big data technology stacks.
    • The goal is to help you assess what technology trade-offs best fit your needs.
  • 21. Decision Vectors
    [Radar charts: "Advanced Web Analytics" and "Advanced Web Analytics & Hadoop" (legend: Advanced Web Analytics, Hadoop), scale 0 to 90. Axes: Handling Huge Volume; Uptime/Load Without Disruption; Minimize Data Modeling; Easy Data Integration; Minimize Administration; Support Integrated Marketing Solutions; Real-time Support; Support Algorithmic Queries; Support BI Tools; Support Stats Tools; Expertise Available]
  • 22. And Here’s a Snapshot of the Decision Matrix
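One way to turn decision vectors into a comparison is a need-weighted score per stack. The sketch below is an assumption about how such a matrix could be used, not the presenters' method: the weights, the capability scores, the generic `stack_a` and `stack_b` names, and the `fit` function are all hypothetical, and only the axis names come from the slides.

```python
# Hypothetical importance (0-100) of each decision vector for one
# business function. Values are illustrative, not from the presentation.
needs = {
    "Handling Huge Volume": 90,
    "Real-time Support": 20,
    "Support BI Tools": 60,
    "Minimize Data Modeling": 70,
}

# Hypothetical capability scores (0-100) for two unnamed stacks.
stacks = {
    "stack_a": {
        "Handling Huge Volume": 80,
        "Real-time Support": 30,
        "Support BI Tools": 40,
        "Minimize Data Modeling": 90,
    },
    "stack_b": {
        "Handling Huge Volume": 60,
        "Real-time Support": 80,
        "Support BI Tools": 90,
        "Minimize Data Modeling": 30,
    },
}


def fit(needs, capabilities):
    """Average capability score, weighted by how much each vector matters."""
    total_weight = sum(needs.values())
    return sum(needs[v] * capabilities[v] for v in needs) / total_weight


ranked = sorted(stacks, key=lambda name: fit(needs, stacks[name]), reverse=True)
for name in ranked:
    print(name, round(fit(needs, stacks[name]), 2))
```

A weighted average hides hard constraints (a stack that cannot meet a mandatory requirement should be excluded outright), so a score like this is best read as a starting point for discussion rather than a decision.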
  • 23. Most Common Failure Points
    Here are some common risk points:
    • Data Windows
      – Insisting on Too Much History
      – Using a single technology
    • ETL
      – Keeping too much data
      – Missing Join Keys
    • Integration
      – Failure to Reckon with Streams
      – Assumption that a key is all that’s necessary
    • Analytics
      – Ad Hoc Effort instead of up-front segmentation
      – Failure to understand Topology
    • Data Democratization
      – Lack of structure
      – Tool Complexity
    • Real-time
      – Single Technology Stack
      – Unrealistic expectations
  • 24. Digital Analytics is a Paradigm Big Data Application
    Digital Measurement is a paradigm case of big data:
    • Lots of data
      – Millions (hundreds of millions?) of events per day
      – Lots of data per event
    • Lots of key High Cardinality variables
      – Page Name, Product Sets, Referrers, Campaigns, Keywords
      – and Customers
    • Focus on Detail-Level Analytics:
      – Customer Lifetime Value
      – Full multi-touch attribution
    • Lack of meaning at the Row Level
      – In digital, meaning exists in a collection of records.
  • 25. Digital Analytics is a Paradigm Big Data Application
    Digital Measurement is a paradigm case of big data:
    • Lots of data
      – Millions (hundreds of millions?) of events per day
      – Lots of data per event
    • Lots of key High Cardinality variables
      – Page Name, Product Sets, Referrers, Campaigns, Keywords
      – and Customers
    • Focus on Detail-Level Analytics:
      – Customer Lifetime Value
      – Full multi-touch attribution
    • Lack of meaning at the Row Level
      – In digital, meaning exists in a collection of records.
    [Callout: BIG DATA PLATFORM]
  • 26. The Big Data Platform Requirements
    • Analyze Extreme Volumes of Data
      – Online, Offline, Social, Behavior, First Party & Third Party across multiple channels
    • Analyze Wide Variety of Data
      – Structured: POS, 3rd Party, Transactions
      – Unstructured: Social, Video, Blogs
      – Semi-Structured: Cookies, Impressions
    • Analyze Data in Real Time
      – Product Recommendations, Real Time offers, Targeted Ads in Real Time
    • Discover & Experiment
      – Ad-hoc analytics, data discovery & experimentation
    • Governance
      – Enforce data structure, integrity and control to ensure consistency
    [Diagram: data sources feeding the BIG DATA PLATFORM. Online: Impressions, Cookies, Registrations, Purchase Transactions, In-Market Intent. Social: Influence, Sentiments, Followers, Recommendations, Likes. 3rd Party: Psychographic surveys, Geo-Demographic, Segments, Offline Transactions, Responses]
  • 27. IBM’s Big Data Platform
    • Netezza
      – Extreme Performance
      – In-Database Analytics
      – Scalable Appliance
    • Streams
      – Act on Data “In-Motion”
      – Real-time analytics
      – Alerts/Actions
    • Big Insights
      – Hadoop / Unstructured Data
      – Complex Analytics
    [Diagram: data sources feeding the BIG DATA PLATFORM. Online: Impressions, Cookies, Registrations, Purchase Transactions, In-Market Intent. Social: Influence, Sentiments, Followers, Recommendations, Likes. 3rd Party: Psychographic surveys, Geo-Demographic, Segments, Offline Transactions, Responses]
  • 28. Audience Q&A
    Download the Full Whitepaper at:
      http://www.semphonic.com/a-big-data-technology-stack.html
    Learn more about IBM’s big data solutions at:
      http://www.ibmbigdatahub.com
      http://www.analyzingmedia.com