Proposal for Industry Presentation with Demo
                 Targeting for Computational Market Research
                                                            Frank Smadja
                                                             Toluna Inc.
                                                     MATAM, Haifa 31905, ISRAEL
                                                      frank.smadja@toluna.com

ABSTRACT
We introduce here the concept of “computational market research” by drawing on the analogy with computational advertising. We first explain how market research traditionally works and we describe the current state of the art in the field through examples. We then explain how, by introducing appropriate targeting and user modeling techniques, market research can be conducted at a larger scale, in a more efficient (for companies) and more enjoyable (for users) manner. We focus on the technical challenges of user modeling for the specific goals of computational market research, illustrating our purpose with actual examples from toluna.com. Toluna.com is one of the leading Web2.0 sites for polls, surveys and opinions, where users can voluntarily join market research panels. We show real life examples on a live demo during the workshop, and demonstrate novel features for polling and user qualification on the toluna.com site. We conclude by discussing future research directions for this emerging field.

Categories and Subject Descriptors
H.1.1 [Models and Principles]: User/Machine Systems

General Terms
Measurement, Economics, Experimentation, Human Factors.

Keywords
Computational Market Research, User Models, Market Research, Placement Ads, Computational Advertising.

UMWA’2011, February 9, 2011, Hong-Kong, China.
Copyright 2010 ACM 1-58113-000-0/00/0010…$10.00.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

1. INTRODUCTION

1.1 Market Research: Common Practices and Challenges
Market research is a marketing tool that allows companies to receive feedback from their customers or consumers on a variety of topics; it is usually intended to gather market intelligence, to identify and analyze market needs for specific products in order to launch a new product or change an existing one. Traditionally, this feedback is obtained via surveys that target a specific segment of the population. For example, a dairy product manufacturer that wants to better understand people’s opinions on fruit yogurt would like answers to questions such as: “do people prefer yogurt with fruit at the bottom, at the top, or blended with the yogurt itself? Is there a significant difference in terms of demographic characteristics between people who prefer the fruit at the bottom, at the top or blended?” To obtain the answers to such questions the manufacturer will need to survey a number of consumers and ask them a series of questions on their yogurt eating habits and preferences.

A critical part of a market research study is the selection of the specific users who will give this type of feedback. This selection process is usually conducted through a combination of specific demographic criteria (age, gender, place of residence, income level, etc.) and domain specific questions such as: “do you eat yogurt regularly? do you ride a bike regularly? do you use aspirin on a daily basis?” etc. This process consists of a targeting step and a screening step. The targeting step is typically driven by basic demographic attributes; for example, a cosmetic brand might decide to mostly address young mothers. The screening step is driven by specific domain questions: using the same cosmetic example, whether the young mother has dry skin, or whether she has long hair, etc. The users rejected at the screening step are referred to as screened-out users, or screen-outs for short. A user who is not targeted will simply not be invited to participate in the survey. A user who successfully went through the selection process and completed the survey is referred to as a complete.

Typically, the customer (the buyer) pays only for each complete and not for screen-outs; thus the burden of selecting a good target user group falls on the market research company (the seller) inviting the users to answer specific surveys. The cost per complete usually reflects the targeting requirements and the specificity of the screening process; however, the pricing is usually done manually. This can be compared to a “guaranteed delivery” system for display ads as described in [1]. This user qualification process is similar to targeting in computational advertising and is critical to a successful market research study.

Standard market research practice consists of simply including the screening process as part of the survey as a set of preliminary questions. It is relatively easy to store for each user the answers to demographic questions and reuse them in the future for other surveys. However, it is very hard to know or predict the answers to various “domain” questions (e.g., owners of two dogs, bike riders, Alfa Romeo owners, Xbox players, etc.). These change all the time and cover a huge range of domains. In addition, some completes are more valuable than others; for example, there is more demand for “IT managers” than for “middle-aged housewives”. Similarly, the screening process has an influence on
pricing as it is easier for instance to identify 2,000 people with a driver’s license than 2,000 people who own two dogs. The typical overall funnel-like process is depicted in Figure 1.

Figure 1: The Typical Market Research Funnel

1.2 Computational Market Research: Getting to the Next Level
In a similar way to computational advertising [1], computational market research can be defined as a set of techniques for finding the best match between a user in a given context and a suitable survey. There are several players with somewhat conflicting interests involved in the system: the customer who wants people to answer a specific survey or questionnaire (the buyer), the user answering the survey, and the market research company in charge of the execution and delivery of the survey (the seller). These players have different interests: the user wants to have a fun experience and is highly motivated by social and financial incentives, the customer would like a high ROI, and the seller is interested in revenues. As illustrated in Figure 2, the role of the survey router is to maximize the interests of the various players and find the right balance.

Figure 2: Computational Market Research Players

Maximizing revenue only would result in user fatigue and low customer satisfaction; similarly, maximizing only the user interest and fun factor might not lead to timely completion of a survey and thus not serve the interest of the seller.

Computational market research offers a new way to address this matching problem by taking advantage of the three following observations:
    •    Economy of Scale: With the penetration of the Internet, it is now possible to reach millions of users and thus conduct more and more surveys on larger and larger populations for more and more precise results. Traditional market research with a few customers and dozens of surveys suddenly becomes obsolete.
    •    Probe users in context: As users are more connected than ever, it becomes easier to reach them at the right time, in the right context, rather than interrupting them at home with annoying phone calls. A survey closely resembles an online display ad and as such has a better chance of leading to conversion when presented in the right context. A survey is answered by redirecting traffic to it and not by inviting users to answer it in an offline manner.
    •    Make market research accessible to all: Market research should be accessible to smaller companies and individuals, and thus more affordable. This means that the market should move towards a do-it-yourself approach, sell less service and increase automation. This again is reminiscent of the ads market.

The basic requirement is thus to think of the problem algorithmically. For example, pricing, targeting and routing should be done automatically on live traffic, in a very similar way to what is done with display ads in the industry. These are no small challenges; for example, it is easy to price a specific demographic target (say young mothers) but hard to price domain attributes (people with two dogs, people owning an Xbox, people who commute more than 2 hours a day, etc.). Is it more expensive to reach “people who ride a bike to work” or “people who own two dogs”? See [2] for pricing models for placement ads.

In Table 1 below we show the analogy between computational market research and computational advertisement on a number of parameters.

Table 1: Computational Advertisement vs Computational Market Research
Concept                  Advertisement           Market Research
Tool                     Display ad              Poll, Survey
Conversion               Click through           Complete
Pricing                  Guaranteed              Fixed price
                         delivery
Irrelevant user          Worthless to            Screen-out
                         advertiser
Targeting                Targeting               Targeting
Live targeting           Behavioral              Screening
                         targeting
Rejection                Users electing not      Screen-out
                         to receive Ads¹
User Context             Browsing history,       Demographics,
                         behavior,               Previous answers,
                         demographics            surveys
Perception of            Spam, Noise             Waste, frustration
Irrelevance
Incentive for user       Buy, find target        Fun, social rewards
                                                 & financial
                                                 incentives
User experience          Few clicks              Several minutes
duration

¹ Some search engines allow their users not to be exposed to Ads relating to given market domains such as gaming, electronics, etc.

2. USER MODELING FOR COMPUTATIONAL MARKET RESEARCH
As discussed above, in order to apply a scalable approach to market research and truly turn the field into computational market research, we need an automated mechanism to gather demographic and domain attributes on a large population of users. Assume, for example, that we need to find 5,000 people who ride bikes and live in the area of Central London. If we already have 2,000 available users who answered a biking poll in the past and told us that they regularly ride bikes, the task at hand is to identify the additional 3,000 users. Knowing that on average only 5% of the London population actually rides bikes, sending traffic or email invites indiscriminately would mean inviting 60,000 users (3,000 / 0.05), with a screen-out rate of 95%. The result is easy to imagine: the survey would cost a lot and the customer would not be happy. In addition, the users being screened out would be annoyed and rapidly get tired of answering even relevant surveys in the future. This is where user modeling comes into play. In order to automate market research, we need a user model that consists of a set of demographic and domain attributes. Such a user model is central to the automation of the targeting and screening stages. It would allow the market research company to price and route surveys properly and in a more efficient way than is currently done in traditional market research. As with ads, a relevant survey can be appreciated by users, while an irrelevant one is seen as spam.

3. AN EXAMPLE: TOLUNA.COM
Toluna.com is one of the most active social sites for online voting and opinions. It is a Web2.0 site completely geared towards polls, surveys and opinions of users. Toluna members can voice their opinions on any topic but they can also poll the community and get other users’ opinions. Toluna currently counts more than 4 million active users worldwide. In November 2010 alone, users voted 30 million times (i.e., a rate of 1 million votes a day), created 90,000 polls and topics and expressed about 700,000 full text opinions on a huge range of topics. Traffic is constantly growing; during that same month 180,000 new users registered on Toluna.com. On average, the site sees over 50,000 paid-for survey completes every day.

Two main things distinguish Toluna.com from other social sites. First and foremost, Toluna.com is a social site geared towards polls, opinions and surveys. In addition, users can easily participate in professional surveys and polls that are present on the site. The professional polls are referred to as “sponsored polls” and the user usually receives points when answering them. With their points users can buy purchase vouchers (for example Amazon purchase cards) or even get cash. The financial incentive is essential to compensate users for their time: some surveys can take more than 15 minutes to answer and not all of them are interesting. As demonstrated by Raban in [3] in the community Answers domain, financial incentive is critical to attract initial users even if long-term engagement relies on social rewards. We therefore use a mix of financial and social incentives for both sponsored and organic polls as discussed in [4].

Figure 3 below illustrates a sponsored poll generated by our panel team in order to get answers on a generic topic. In this case, the parameter was the activity level of the users on social sites. The goal of the sponsored polls is mostly to enhance targeting capabilities.

Figure 3: A "sponsored" poll on toluna.com

Figure 4 below shows an organic poll generated by a user for no other purpose than social engagement.

Figure 4: An organic poll on Toluna.com

Figure 6 below shows an organic topic launched by a user with no incentive other than getting other people’s opinions. The topics are answered as open-ended text answers.

This combination of organic and sponsored polls as well as social and financial incentives is what makes toluna.com unique. We advocate a computational market research approach by applying the following principles:
    1.   Gather users’ demographic and domain attributes through organic polls and thus build an ever-growing user model.
    2.   Leverage the user model for automatic targeting and screening of sponsored polls.
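
The two principles above, together with the Central London example of Section 2, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Toluna implementation: the names UserModel, record_poll_answer, qualifies and blind_invites_needed, as well as the attribute keys "city" and "rides_bike", are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class UserModel:
    """Hypothetical user model: demographic plus domain attributes."""
    user_id: str
    demographics: dict = field(default_factory=dict)  # e.g. {"city": "London"}
    domain: dict = field(default_factory=dict)        # e.g. {"rides_bike": True}

    def record_poll_answer(self, attribute: str, value) -> None:
        # Principle 1: each organic poll answer enriches the user model.
        self.domain[attribute] = value


def qualifies(user: UserModel, demographics: dict, domain: dict) -> bool:
    """Principle 2: targeting and screening collapsed into a single check.

    An attribute that the model has not recorded yet counts as not matching.
    """
    targeted = all(user.demographics.get(k) == v for k, v in demographics.items())
    screened_in = all(user.domain.get(k) == v for k, v in domain.items())
    return targeted and screened_in


def blind_invites_needed(completes_wanted: int, already_qualified: int,
                         incidence_percent: int) -> int:
    """Invitations needed when the missing users are invited indiscriminately.

    Assumes that every invited user responds, as in the Section 2 example.
    """
    missing = max(completes_wanted - already_qualified, 0)
    # Integer ceiling of missing / (incidence_percent / 100).
    return (missing * 100 + incidence_percent - 1) // incidence_percent


users = [
    UserModel("u1", {"city": "London"}),
    UserModel("u2", {"city": "London"}),
    UserModel("u3", {"city": "Paris"}),
]
users[0].record_poll_answer("rides_bike", True)
users[1].record_poll_answer("rides_bike", False)
users[2].record_poll_answer("rides_bike", True)

qualified = [u.user_id for u in users
             if qualifies(u, {"city": "London"}, {"rides_bike": True})]
print(qualified)                             # ['u1']
print(blind_invites_needed(5000, 2000, 5))   # 60000
```

The last line reproduces the 60,000 invitations of the Section 2 example; every user that the model can qualify in advance lowers that number and the associated screen-out rate.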
Note that the vast majority of organic polls are initiated by users in a natural manner (over 95% of all polls are organic) and are critical to successful user engagement on the site. Toluna editors can also initiate polls which are not paid for by any customer but can either increase engagement on hot topics or gather new attributes that are expected to be relevant to paying customers in the future. Such polls often trigger more polls, user-initiated this time, and thus continue enriching the user model at low cost.

Figure 6: An organic opinion topic on toluna.com

Our primary effort at Toluna consists of replacing the email invitation process by a “live” selection of traffic: as users on toluna.com answer polls and give their opinions, some of their responses automatically qualify them and seamlessly transfer them to a sponsored survey. We are thus merging the targeting and screening processes into a single “qualification” process as shown in Figure 5 below.

Figure 5: The Computerized Market Research Funnel

Merging the screening and targeting steps makes a huge difference both for the user and for the customer ordering the survey. The first advantage is that we eliminate the static selection of users and email invitations and instead send qualified traffic to a survey. As a direct consequence, we significantly reduce for users the frustration of being screened out and bring down the price per complete to an affordable level.

4. CONCLUSION AND FUTURE DIRECTIONS
We have described here how market research can truly become “computational” by merging the screening and targeting stages and have explained how at toluna.com we used a mix of organic and sponsored polls, as well as social and financial incentives, to build a scalable user base that supports this approach.

We believe, however, that the qualification process can still be improved, so as to reduce the need for editors to generate organic polls preemptively for expected domains of interest. Indeed, one of the key challenges of computational market research, which also exists in display ads, is that we cannot predict ahead of time which types of domains and associated features our customers will be interested in. For recurring features, we could consider training for instance a “biker classifier” or a “two-dog owner” classifier, but in the long run we need to be able to assemble atomic features on the fly so as to generate “on demand” the appropriate user models for a given survey.

Another avenue of research is to use “user similarity” models, where we seed our system with qualified users obtained possibly via traditional “manual” methods and then identify similar users based on similar behavior towards polls, surveys and opinions on the site. We are considering traditional recommender systems technologies for this purpose.

We believe that computational market research is still in its infancy and has much to learn from the progress of computational advertising in the last few years.

5. REFERENCES
[1] Andrei Broder and Vanja Josifovski. Introduction to Computational Advertising. Yahoo! Research and Stanford University. http://www.stanford.edu/class/msande239/lectures-2010/lecture-07.pdf
[2] Arpita Ghosh, Preston McAfee, Kishore Papineni, and Sergei Vassilvitskii. Bidding for representative allocations for display advertising. CoRR, abs/0910.0880, 2009. http://arxiv.org/abs/0910.0880
[3] Daphne Raban. The Incentive Structure in an Online Information Market. Journal of the American Society for Information Science and Technology, 2008. http://gsb.haifa.ac.il/~draban/home/Raban_JASIST2008.pdf
[4] Frank Smadja. Mixing Financial, Social and Fun Incentives for Social Voting. Webcentives, 1st International Workshop on Motivation and Incentives on the Web. Collocated with WWW09, Madrid, Spain. http://webcentives09.sti-innsbruck.at/proceedings-webcentives.pdf

Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Targeting for Computational Market Research

Proposal for Industry Presentation with Demo
Targeting for Computational Market Research

Frank Smadja
Toluna Inc.
MATAM, Haifa 31905, ISRAEL
frank.smadja@toluna.com

ABSTRACT
We introduce here the concept of “computational market research” by drawing on the analogy with computational advertising. We first explain how market research traditionally works and we describe the current state of the art in the field through examples. We then explain how, by introducing appropriate targeting and user modeling techniques, market research can be conducted at a larger scale, in a more efficient (for companies) and more enjoyable (for users) manner. We focus on the technical challenges of user modeling for the specific goals of computational market research, illustrating our purpose with actual examples from toluna.com. Toluna.com is one of the leading Web2.0 sites for polls, surveys and opinions, where users can voluntarily join market research panels. We show real life examples on a live demo during the workshop, and demonstrate novel features for polling and user qualification on the toluna.com site. We conclude by discussing future research directions for this emerging field.

Categories and Subject Descriptors
H.1.1 [Models and Principles]: User/Machine Systems

General Terms
Measurement, Economics, Experimentation, Human Factors.

Keywords
Computational Market Research, User Models, Market Research, Placement Ads, Computational Advertising.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
UMWA’2011, February 9, 2011, Hong-Kong, China.
Copyright 2010 ACM 1-58113-000-0/00/0010…$10.00.

1. INTRODUCTION

1.1 Market Research: Common Practices and Challenges
Market research is a marketing tool that allows companies to receive feedback from their customers or consumers on a variety of topics; it is usually intended to gather market intelligence, to identify and analyze market needs for specific products in order to launch a new product or change an existing one. Traditionally, this feedback is obtained via surveys that target a specific segment of the population. For example, a dairy product manufacturer that wants to better understand people’s opinions on fruit yogurt would like answers to questions such as: “do people prefer yogurt with fruit at the bottom, at the top, or blended with the yogurt itself? Is there a significant difference in terms of demographic characteristics between people who prefer the fruit at the bottom, at the top or blended?” To obtain the answers to such questions, the manufacturer will need to survey a number of consumers and ask them a series of questions on their yogurt eating habits and preferences.

A critical part of a market research study is the selection of the specific users who will give this type of feedback. This selection process is usually conducted through a combination of specific demographic criteria (age, gender, place of residence, income level, etc.) and domain specific questions such as: “do you eat yogurt regularly? do you ride a bike regularly? do you use aspirin on a daily basis?” etc. This process consists of a targeting step and a screening step. The targeting step is typically driven by basic demographic attributes; for example, a cosmetic brand might decide to mostly address young mothers. The screening step is driven by specific domain questions: using the same cosmetic example, whether the young mother has dry skin, or whether she has long hair, etc. The users rejected at the screening step are referred to as screened-out users, or screen-outs for short. A user who is not targeted will simply not get invited to participate in the survey. A user who successfully went through the selection process and completed the survey is referred to as a complete.

Typically, the customer (the buyer) pays only for each complete and not for screen-outs; thus the burden of selecting a good target user group falls on the market research company (the seller) inviting the users to answer specific surveys. The cost per complete usually reflects the targeting requirements and the specificity of the screening process; however, the pricing is usually done manually. This can be compared to a “guaranteed delivery” system for display ads as described in [1]. This user qualification process is similar to targeting in computational advertising and is critical to a successful market research study.

Standard market research practice consists of simply including the screening process as part of the survey as a set of preliminary questions. It is relatively easy to store for each user the answers to demographic questions and reuse them in the future for other surveys. However, it is very hard to know or predict the answers to various “domain” questions (e.g., owners of two dogs, bike riders, Alfa Romeo owners, Xbox players, etc.). These change all the time and cover a huge range of domains. In addition, some completes are more valuable than others; for example, there is more demand for “IT managers” than for “middle-aged housewives”. Similarly, the screening process has an influence on pricing, as it is easier for instance to identify 2,000 people with a driver’s license than 2,000 people who own two dogs. The typical overall funnel-like process is depicted in Figure 1.

Figure 1: The Typical Market Research Funnel

1.2 Computational Market Research: Getting to the Next Level
In a similar way to computational advertising [1], computational market research can be defined as a set of techniques for finding the best match between a user in a given context and a suitable survey. There are several players with somewhat conflicting interests involved in the system: the customer who wants people to answer a specific survey or questionnaire (the buyer), the user answering the survey, and the market research company in charge of the execution and delivery of the survey (the seller). These players have different interests: the user wants to have a fun experience and is highly motivated by social and financial incentives, the customer would like a high ROI, and the seller is interested in revenues. As illustrated in Figure 2, the role of the survey router is to maximize the interests of the various players and find the right balance.

Figure 2: Computational Market Research Players

Maximizing revenue only would result in user fatigue and low customer satisfaction; similarly, maximizing only the user interest and fun factor might not lead to timely completion of a survey and thus not serve the interest of the seller.

Computational market research offers a new way to address this matching problem by taking advantage of the three following observations:

• Economy of Scale: With the penetration of the Internet, it is now possible to reach millions of users and thus conduct more and more surveys on larger and larger populations for more and more precise results. Traditional market research with a few customers and dozens of surveys suddenly becomes obsolete.
• Probe users in context: As users are more connected than ever, it becomes easier to reach them at the right time, in the right context, rather than interrupting them at home with annoying phone calls. A survey closely resembles an online display ad and as such has more chance to lead to conversion when presented in the right context. A survey is answered by redirecting traffic to it and not by inviting users to answer it in an offline manner.
• Make market research accessible to all: Market research should be accessible to smaller companies and individuals, and thus more affordable. This means that the market should move towards a do-it-yourself approach, sell less service and increase automation. This again is reminiscent of the ads market.

The basic requirement is thus to think of the problem algorithmically. For example, pricing, targeting and routing should be done automatically on live traffic, in much the same way as is done with display ads in the industry. These are no small challenges: for example, it is easy to price a specific demographic target (say young mothers) but hard to price domain attributes (people with two dogs, people owning an Xbox, people who commute more than 2 hours a day, etc.). Is it more expensive to reach “people who ride a bike to work” or “people who own two dogs”? See [2] for pricing models for placement ads.

In Table 1 below we show the analogy between computational market research and computational advertisement on a number of parameters.

Table 1: Computational Advertisement vs Computational Market Research

Concept                     Advertisement                               Market Research
Tool                        Display ad                                  Poll, Survey
Conversion                  Click through                               Complete
Pricing                     Guaranteed delivery                         Fixed price
Irrelevant user             Worthless to advertiser                     Screen-out
Targeting                   Targeting                                   Targeting
Live targeting              Behavioral targeting                        Screening
Rejection                   Users electing not to receive Ads (1)       Screen-out
User Context                Browsing history, behavior, demographics    Demographics, previous answers, surveys
Perception of Irrelevance   Spam, Noise                                 Waste, frustration
Incentive for user          Buy, find target                            Fun, social rewards & financial incentives
User experience duration    Few clicks                                  Several minutes

(1) Some search engines allow their users not to be exposed to Ads relating to given market domains such as gaming, electronics, etc.

2. USER MODELING FOR COMPUTATIONAL MARKET RESEARCH
As discussed above, in order to apply a scalable approach to market research and truly turn the field into computational market research, we need an automated mechanism to gather demographic and domain attributes on a large population of users. Assume, for example, that we need to find 5,000 people who ride bikes and live in the area of Central London. If we already have 2,000 available users who answered a biking poll in the past and told us that they regularly ride bikes, the task at hand is to identify the additional 3,000 users. Knowing that on average only 5% of the London population actually rides bikes, sending traffic or email invites indiscriminately would mean sending invitations to 60,000 users (3,000 / 0.05) with a screen-out rate of 95%. The result is easy to imagine: the survey would cost a lot and the customer would not be happy. In addition, the users being screened out would be annoyed and rapidly get tired of answering even relevant surveys in the future. This is where user modeling comes into play.

In order to automate market research, we need a user model that consists of a set of demographic and domain attributes. Such a user model is central to the automation of the targeting and screening stages. It would allow the market research company to price and route surveys properly and in a more efficient way than is currently done in traditional market research. As with ads, a relevant survey can be appreciated by users, while an irrelevant one is seen as spam.

3. AN EXAMPLE: TOLUNA.COM
Toluna.com is one of the most active social sites for online voting and opinions. It is a Web2.0 site completely geared towards polls, surveys and opinions of users. Toluna members can voice their opinions on any topic, but they can also poll the community and get other users’ opinions. Toluna currently counts more than 4 million active users worldwide. In November 2010 alone, users voted 30 million times (i.e., a rate of 1 million votes a day), created 90,000 polls and topics and expressed about 700,000 full text opinions on a huge range of topics. Traffic is constantly growing: during that same month, 180,000 new users registered to Toluna.com. On average, the site sees over 50,000 paid-for survey completes every day.

Two main things distinguish Toluna.com from other social sites. First and foremost, Toluna.com is a social site geared towards polls, opinions and surveys. In addition, users can easily participate in professional surveys and polls that are present on the site. The professional polls are referred to as “sponsored polls”, and users usually receive points when answering them. With their points they can buy purchase vouchers (for example Amazon purchase cards) or even get cash. The financial incentive is essential to compensate users for their time: some surveys can take more than 15 minutes to answer and not all of them are interesting. As demonstrated by Raban in [3] in the community Answers domain, financial incentive is critical to attract initial users even if long-term engagement relies on social rewards. We therefore use a mix of financial and social incentives for both sponsored and organic polls, as discussed in [4].

Figure 3 below illustrates a sponsored poll generated by our panel team in order to get answers on a generic topic. In this case, the parameter was the activity level of the users on social sites. The goal of the sponsored polls is mostly to enhance targeting capabilities.

Figure 3: A "sponsored" poll on toluna.com

Figure 4 below shows an organic poll generated by a user for no other purpose than social engagement.

Figure 4: An organic poll on Toluna.com

Figure 6 below shows an organic topic launched by a user with no incentive other than getting other people’s opinions. The topics are answered as open-ended text answers.

This combination of organic and sponsored polls as well as social and financial incentives is what makes toluna.com unique. We advocate a computational market research approach by applying the following principles:

1. Gather demographic and domain attributes about users through organic polls and thus build an ever growing user model.
2. Leverage the user model for automatic targeting and screening of sponsored polls.

Note that the vast majority of organic polls are initiated by users in a natural manner (over 95% of all polls are organic) and are critical to successful user engagement on the site. Toluna editors can also initiate polls which are not paid for by any customer but can either increase engagement on hot topics or gather new attributes that are expected to be relevant to paying customers in the future. Such polls often trigger more polls, user-initiated this time, and thus continue enriching the user model at low cost.

Figure 6: An organic opinion topic on toluna.com

Our primary effort at Toluna consists of replacing the email invitation process by a “live” selection of traffic: as users on toluna.com answer polls and give their opinions, some of their responses automatically qualify them and seamlessly transfer them to a sponsored survey. We are thus merging the targeting and screening processes into a single “qualification” process as shown in Figure 5 below.

Figure 5: The Computerized Market Research Funnel

Merging the screening and targeting steps makes a huge difference both for the user and for the customer ordering the survey. The first advantage is that we eliminate the static selection of users and email invitations and instead send qualified traffic to a survey. As a direct consequence, we significantly reduce for users the frustration of being screened out and bring down the price per complete to an affordable level.

4. CONCLUSION AND FUTURE DIRECTIONS
We have described here how market research can truly become “computational” by merging the screening and targeting stages, and have explained how at toluna.com we used a mix of organic and sponsored polls, as well as social and financial incentives, to build a scalable user base that supports this approach. We believe, however, that the qualification process can still be improved, so as to reduce the need for editors to generate organic polls preemptively for expected domains of interest. Indeed, one of the key challenges of computational market research, which also exists in display ads, is that we cannot predict ahead of time which types of domains and associated features our customers will be interested in. For recurring features, we could consider training for instance a “biker classifier” or a “two-dog owner” classifier, but in the long run we need to be able to assemble atomic features on the fly so as to generate “on demand” the appropriate user models for a given survey. Another avenue of research is to use “user similarity” models, where we seed our system with qualified users, obtained possibly via traditional “manual” methods, and then identify similar users based on similar behavior towards polls, surveys and opinions on the site. We are considering traditional recommender systems technologies for this purpose.

We believe that computational market research is still in its infancy and has much to learn from the progress of computational advertising in the last few years.

5. REFERENCES
[1] Andrei Broder and Vanja Josifovski. Introduction to Computational Advertising, Yahoo! Research and Stanford University. http://www.stanford.edu/class/msande239/lectures-2010/lecture-07.pdf
[2] Arpita Ghosh, Preston McAfee, Kishore Papineni, and Sergei Vassilvitskii. Bidding for representative allocations for display advertising. CoRR, abs/0910.0880, 2009. http://arxiv.org/abs/0910.0880
[3] Daphne Raban. The Incentive Structure in an Online Information Market. Journal of the American Society for Information Science and Technology, 2008. http://gsb.haifa.ac.il/~draban/home/Raban_JASIST2008.pdf
[4] Frank Smadja. Mixing Financial, Social and Fun Incentives for Social Voting. Webcentives, 1st International Workshop on Motivation and Incentives on the Web. Collocated with WWW09, Madrid, Spain. http://webcentives09.sti-innsbruck.at/proceedings-webcentives.pdf
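
The invitation arithmetic in the Section 2 example (5,000 bike riders needed, 2,000 already known, 5% incidence) can be sketched as follows. This is only an illustration and not part of the Toluna system: the function name `invitations_needed` and its parameters are hypothetical, and the sketch assumes that every invited user responds and that the incidence rate applies uniformly to the invited population.

```python
import math
from fractions import Fraction


def invitations_needed(target_completes, already_qualified, incidence):
    """Estimate how many untargeted invitations are needed to fill a quota.

    Hypothetical helper for illustration only. Assumes every invited user
    responds and qualifies with probability `incidence` (a Fraction in (0, 1]).
    Returns (invitations, screen_outs, screen_out_rate).
    """
    if not 0 < incidence <= 1:
        raise ValueError("incidence must be in (0, 1]")
    # Users still to be found beyond those already known to qualify.
    missing = max(target_completes - already_qualified, 0)
    # Exact rational arithmetic avoids floating-point rounding in the ceiling.
    invitations = math.ceil(missing / incidence)
    screen_outs = invitations - missing
    screen_out_rate = screen_outs / invitations if invitations else 0.0
    return invitations, screen_outs, screen_out_rate


# Section 2 example: 5,000 bike riders needed, 2,000 already known, 5% incidence.
invites, screen_outs, rate = invitations_needed(5000, 2000, Fraction(5, 100))
print(invites, screen_outs, f"{rate:.0%}")  # 60000 57000 95%
```

Under these assumptions, 57,000 of the 60,000 invited users would be screened out, which is the cost that a richer user model is meant to avoid.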