Proposal for user modeling workshop: http://research.yahoo.com/workshops/umwa2011/
Proposal for Industry Presentation with Demo
Targeting for Computational Market Research
Frank Smadja
Toluna Inc.
MATAM, Haifa 31905, ISRAEL
frank.smadja@toluna.com

ABSTRACT
We introduce here the concept of “computational market research” by drawing on the analogy with computational advertising. We first explain how market research traditionally works and we describe the current state of the art in the field through examples. We then explain how, by introducing appropriate targeting and user modeling techniques, market research can be conducted at a larger scale, in a more efficient (for companies) and more enjoyable (for users) manner. We focus on the technical challenges of user modeling for the specific goals of computational market research, illustrating our purpose with actual examples from toluna.com. Toluna.com is one of the leading Web2.0 sites for polls, surveys and opinions, where users can voluntarily join market research panels. We show real life examples on a live demo during the workshop, and demonstrate novel features for polling and user qualification on the toluna.com site. We conclude by discussing future research directions for this emerging field.

Categories and Subject Descriptors
H.1.1 [Models and Principles]: User/Machine Systems

General Terms
Measurement, Economics, Experimentation, Human Factors.

Keywords
Computational Market Research, User Models, Market Research, Placement Ads, Computational Advertising.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
UMWA’2011, February 9, 2011, Hong-Kong, China.
Copyright 2010 ACM 1-58113-000-0/00/0010…$10.00.

1. INTRODUCTION

1.1 Market Research: Common Practices and Challenges
Market research is a marketing tool that allows companies to receive feedback from their customers or consumers on a variety of topics; it is usually intended to gather market intelligence and to identify and analyze market needs for specific products in order to launch a new product or change an existing one. Traditionally, this feedback is obtained via surveys that target a specific segment of the population. For example, a dairy product manufacturer that wants to better understand people’s opinions on fruit yogurt would like answers to questions such as: “Do people prefer yogurt with fruit at the bottom, at the top, or blended with the yogurt itself? Is there a significant difference in terms of demographic characteristics between people who prefer the fruit at the bottom, at the top or blended?” To obtain the answers to such questions, the manufacturer will need to survey a number of consumers and ask them a series of questions on their yogurt eating habits and preferences.
A critical part of a market research study is the selection of the specific users who will give this type of feedback. This selection process is usually conducted through a combination of specific demographic criteria (age, gender, place of residence, income level, etc.) and domain-specific questions such as: “Do you eat yogurt regularly? Do you ride a bike regularly? Do you use aspirin on a daily basis?” This process consists of a targeting step and a screening step. The targeting step is typically driven by basic demographic attributes; for example, a cosmetic brand might decide to mostly address young mothers. The screening step is driven by specific domain questions; using the same cosmetic example, whether the young mother has dry skin, or whether she has long hair, etc. Users rejected during the screening process are referred to as screened-out users, or screen-outs for short. A user who is not targeted will simply not get invited to participate in the survey. A user who successfully went through the selection process and completed the survey is referred to as a complete.
Typically, the customer (the buyer) pays only for each complete and not for screen-outs; thus the burden of selecting a good target user group and inviting those users to answer specific surveys falls on the market research company (the seller). The cost per complete usually reflects the targeting requirements and the specificity of the screening process; however, the pricing is usually done manually. This can be compared to a “guaranteed delivery” system for display ads as described in [1]. This user qualification process is similar to targeting in computational advertising and is critical to a successful market research study.
Standard market research practice consists of simply including the screening process as part of the survey, as a set of preliminary questions. It is relatively easy to store for each user the answers to demographic questions and reuse them in the future for other surveys. However, it is very hard to know or predict the answers to various “domain” questions (e.g., owners of two dogs, bike riders, Alfa Romeo owners, Xbox players, etc.). These change all the time and cover a huge range of domains. In addition, some completes are more valuable than others; for example, there is more demand for “IT managers” than for “middle-aged housewives”. Similarly, the screening process has an influence on
pricing, as it is easier, for instance, to identify 2,000 people with a driver’s license than 2,000 people who own two dogs. The typical overall funnel-like process is depicted in Figure 1.

Figure 1: The Typical Market Research Funnel

1.2 Computational Market Research: Getting to the Next Level
In a similar way to computational advertising [1], computational market research can be defined as a set of techniques for finding the best match between a user in a given context and a suitable survey. There are several players with somewhat conflicting interests involved in the system: the customer who wants people to answer a specific survey or questionnaire (the buyer), the user answering the survey, and the market research company in charge of the execution and delivery of the survey (the seller). These players have different interests: the user wants to have a fun experience and is highly motivated by social and financial incentives, the customer would like a high ROI, and the seller is interested in revenues. As illustrated in Figure 2, the role of the survey router is to maximize the interests of the various players and find the right balance.

Figure 2: Computational Market Research Players

Maximizing revenue only would result in user fatigue and low customer satisfaction; similarly, maximizing only the user interest and fun factor might not lead to timely completion of a survey and would thus not serve the interest of the seller.
Computational market research offers a new way to address this matching problem by taking advantage of the three following observations:
• Economy of Scale: With the penetration of the Internet, it is now possible to reach millions of users and thus conduct more and more surveys on larger and larger populations for more and more precise results. Traditional market research with a few customers and dozens of surveys suddenly becomes obsolete.
• Probe users in context: As users are more connected than ever, it becomes easier to reach them at the right time, in the right context, rather than interrupting them at home with annoying phone calls. A survey closely resembles an online display ad and as such is more likely to lead to conversion when presented in the right context. A survey is answered by redirecting traffic to it and not by inviting users to answer it in an offline manner.
• Make market research accessible to all: Market research should be accessible to smaller companies and individuals, and thus more affordable. This means that the market should move towards a do-it-yourself approach, sell less service and increase automation. This again is reminiscent of the ads market.
The basic requirement is thus to think of the problem algorithmically. For example, pricing, targeting and routing should be done automatically on live traffic, much as is done with display ads in the industry. These are no small challenges: for example, it is easy to price a specific demographic target (say, young mothers) but hard to price domain attributes (people with two dogs, people owning an Xbox, people who commute more than 2 hours a day, etc.). Is it more expensive to reach “people who ride a bike to work” or “people who own two dogs”? See [2] for pricing models for placement ads.
In Table 1 below we show the analogy between computational market research and computational advertisement on a number of parameters.

Table 1: Computational Advertisement vs Computational Market Research

Concept                     Advertisement                                 Market Research
Tool                        Display ad                                    Poll, Survey
Conversion                  Click through                                 Complete
Pricing                     Guaranteed delivery                           Fixed price
Irrelevant user             Worthless to advertiser                       Screen-out
Targeting                   Targeting                                     Targeting
Live targeting              Behavioral targeting                          Screening
Rejection                   Users electing not to receive ads (1)         Screen-out
User Context                Browsing history, behavior, demographics      Demographics, previous answers, surveys
Perception of Irrelevance   Spam, noise                                   Waste, frustration
Incentive for user          Buy, find target                              Fun, social rewards & financial incentives
User experience duration    Few clicks                                    Several minutes

(1) Some search engines allow their users not to be exposed to ads relating to given market domains such as gaming, electronics, etc.

2. USER MODELING FOR COMPUTATIONAL MARKET RESEARCH
As discussed above, in order to apply a scalable approach to market research and truly turn the field into computational market research, we need an automated mechanism to gather demographic and domain attributes on a large population of users. Assume, for example, that we need to find 5,000 people who ride bikes and live in the area of Central London. If we already have 2,000 available users who answered a biking poll in the past and told us that they regularly ride bikes, the task at hand is to identify the additional 3,000 users. Knowing that on average only 5% of the London population actually rides bikes, sending traffic or email invites indiscriminately would lead to sending invitations to 60,000 users with a screen-out rate of 95%. The result is easy to imagine: the survey would cost a lot and the customer would not be happy. In addition, the users being screened out would be annoyed and would rapidly get tired of answering even relevant surveys in the future. This is where user modeling comes into play. In order to automate market research, we need a user model that consists of a set of demographic and domain attributes. Such a user model is central to the automation of the targeting and screening stages. It would allow the market research company to price and route surveys properly and in a more efficient way than is currently done in traditional market research. As with ads, a relevant survey can be appreciated by users, while an irrelevant one is seen as spam.

3. AN EXAMPLE: TOLUNA.COM
Toluna.com is one of the most active social sites for online voting and opinions. It is a Web2.0 site completely geared towards polls, surveys and opinions of users. Toluna members can voice their opinions on any topic, but they can also poll the community and get other users’ opinions. Toluna currently counts more than 4 million active users worldwide. In November 2010 alone, users voted 30 million times (i.e., a rate of 1 million votes a day), created 90,000 polls and topics and expressed about 700,000 full-text opinions on a huge range of topics. Traffic is constantly growing: during that same month, 180,000 new users registered on Toluna.com. On average, the site sees over 50,000 paid-for survey completes every day.
Two main things distinguish Toluna.com from other social sites. First and foremost, Toluna.com is a social site geared towards polls, opinions and surveys. In addition, users can easily participate in professional surveys and polls that are present on the site. The professional polls are referred to as “sponsored polls” and users usually receive points when answering them. With their points they can buy purchase vouchers (for example, Amazon purchase cards) or even get cash. The financial incentive is essential to compensate users for their time: some surveys can take more than 15 minutes to answer and not all of them are interesting. As demonstrated by Raban in [3] in the community Answers domain, financial incentive is critical to attract initial users even if long-term engagement relies on social rewards. We therefore use a mix of financial and social incentives for both sponsored and organic polls, as discussed in [4].
Figure 3 below illustrates a sponsored poll generated by our panel team in order to get answers on a generic topic. In this case, the parameter was the activity level of the users on social sites. The goal of the sponsored polls is mostly to enhance targeting capabilities.

Figure 3: A "sponsored" poll on toluna.com

Figure 4 below shows an organic poll generated by a user for no other purpose than social engagement.

Figure 4: An organic poll on Toluna.com

Figure 6 below shows an organic topic launched by a user with no incentive other than getting other people’s opinions. The topics are answered as open-ended text answers.
This combination of organic and sponsored polls, as well as social and financial incentives, is what makes toluna.com unique. We advocate a computational market research approach by applying the following principles:
1. Gather users’ demographic and domain attributes through organic polls and thus build an ever-growing user model.
2. Leverage the user model for automatic targeting and screening of sponsored polls.
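As an illustration only, the two principles above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the toluna.com implementation: the UserModel class, the attribute names (city, rides_bike) and both helper functions are hypothetical. The last helper simply reproduces the arithmetic of the Central London biking example from Section 2, assuming every invited user responds.

```python
from dataclasses import dataclass, field
from fractions import Fraction
import math


@dataclass
class UserModel:
    """Hypothetical user model: demographic plus domain attributes."""
    user_id: str
    demographics: dict = field(default_factory=dict)
    domain: dict = field(default_factory=dict)

    def record_poll_answer(self, attribute, value):
        # Principle 1: each organic poll answer enriches the user model.
        self.domain[attribute] = value


def qualifies(user, criteria):
    """Principle 2: qualify a user only if every required attribute
    is already known and matches the survey criteria."""
    known = {**user.demographics, **user.domain}
    return all(known.get(name) == value for name, value in criteria.items())


def blind_invites_needed(completes_needed, incidence):
    """Invitations needed when users are invited indiscriminately
    (assumes every invited user responds)."""
    return math.ceil(completes_needed / incidence)


# Made-up users and answers, for illustration only.
users = [
    UserModel("u1", {"city": "London"}),
    UserModel("u2", {"city": "London"}),
    UserModel("u3", {"city": "Haifa"}),
]
users[0].record_poll_answer("rides_bike", True)
users[1].record_poll_answer("rides_bike", False)
users[2].record_poll_answer("rides_bike", True)

survey_criteria = {"city": "London", "rides_bike": True}
qualified = [u.user_id for u in users if qualifies(u, survey_criteria)]
print(qualified)  # ['u1']

# 3,000 missing completes at a 5% incidence rate.
print(blind_invites_needed(3000, Fraction(5, 100)))  # 60000
```

With a populated user model, only u1 would be routed to the London biking survey, whereas inviting users blindly at a 5% incidence rate would take 60,000 invitations to obtain the 3,000 missing completes.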
Note that organic polls are, in their vast majority, initiated by users in a natural manner (over 95% of all polls are organic) and are critical to successful user engagement on the site. Toluna editors can also initiate polls which are not paid for by any customer but can either increase engagement on hot topics or gather new attributes that are expected to be relevant to paying customers in the future. Such polls often trigger more polls, user-initiated this time, and thus continue enriching the user model at low cost.

Figure 6: An organic opinion topic on toluna.com

Our primary effort at Toluna consists of replacing the email invitation process by a “live” selection of traffic: as users on toluna.com answer polls and give their opinions, some of their responses automatically qualify them and seamlessly transfer them to a sponsored survey. We are thus merging the targeting and screening processes into a single “qualification” process as shown in Figure 5 below.

Figure 5: The Computerized Market Research Funnel

Merging the screening and targeting steps makes a huge difference both for the user and for the customer ordering the survey. The first advantage is that we eliminate the static selection of users and email invitations and instead send qualified traffic to a survey. As a direct consequence, we significantly reduce for users the frustration of being screened out and bring down the price per complete to an affordable level.

4. CONCLUSION AND FUTURE DIRECTIONS
We have described here how market research can truly become “computational” by merging the screening and targeting stages and have explained how at toluna.com we used a mix of organic and sponsored polls, as well as social and financial incentives, to build a scalable user base that supports this approach.
We believe, however, that the qualification process can still be improved, so as to reduce the need for editors to generate organic polls preemptively for expected domains of interest. Indeed, one of the key challenges of computational market research, which also exists in display ads, is that we cannot predict ahead of time which types of domains and associated features our customers will be interested in. For recurring features, we could consider training, for instance, a “biker” classifier or a “two-dog owner” classifier, but in the long run we need to be able to assemble atomic features on the fly so as to generate “on demand” the appropriate user models for a given survey.
Another avenue of research is to use “user similarity” models, where we seed our system with qualified users obtained possibly via traditional “manual” methods and then identify similar users based on similar behavior towards polls, surveys and opinions on the site. We are considering traditional recommender systems technologies for this purpose.
We believe that computational market research is still in its infancy and has much to learn from the progress of computational advertising in the last few years.

5. REFERENCES
[1] Andrei Broder and Vanja Josifovski. Introduction to Computational Advertising, Yahoo! Research and Stanford University. http://www.stanford.edu/class/msande239/lectures-2010/lecture-07.pdf
[2] Arpita Ghosh, Preston McAfee, Kishore Papineni, and Sergei Vassilvitskii. Bidding for representative allocations for display advertising. CoRR, abs/0910.0880, 2009. http://arxiv.org/abs/0910.0880
[3] Daphne Raban. The Incentive Structure in an Online Information Market. Journal of the American Society for Information Science and Technology, 2008. http://gsb.haifa.ac.il/~draban/home/Raban_JASIST2008.pdf
[4] Frank Smadja. “Mixing Financial, Social and Fun Incentives for Social Voting”, Webcentives, 1st International workshop on Motivation and Incentives on the Web. Collocated with WWW09, Madrid, Spain. http://webcentives09.sti-innsbruck.at/proceedings-webcentives.pdf