Over The Top TV (OTT), Big Data and The Future Of Predictive Buyer Analytics
Steve Wong
Siemens CVC
@stevewongLA
Dr. Bob Deutsch
Brain-Sells
@bobdeutsch8
Arnav Mendiratta
USC
@arnavmendiratta
Los Angeles, CA.
Abstract - The consumption of movies and television
programs is shifting to an Over-the-Top TV (OTT) delivery
model. The volume of real-time information available from
OTT applications on mobile devices, tablets, and smart TVs
is enormous. Inexpensive processing power, effective
real-time analysis technologies for big data, and the ability
to analyze data from applications on mobile devices, tablets,
smart TVs, and social media make it possible to match buyer
profiles to the viewer profiles of OTT movies and television
programs. This real-time matching of buyer profiles to
viewing profiles lays the groundwork for predicting buying
habits. The metadata managed by OTT platforms can be
analyzed in real time. In this paper we present the
challenges and opportunities of Big Data predictive buyer
analytics with OTT applications.
WHAT IS OVER THE TOP TV (OTT)?
The Internet has revolutionized the way people consume
movies and television shows around the world. (The Internet
was fully commercialized in the U.S. by 1995.) Not only has
viewing behavior changed (binge viewing and catch-up), but
the devices we consume media on have changed as well. (The
first iPad was released on April 3, 2010, and Smart TVs
became the dominant form of television by the late 2010s.)
Over the Top TV (OTT) is the delivery of audio, video, and
other media over the Internet without the involvement of a
Multiple-System Operator (MSO), Direct Broadcast Satellite
(DBS) provider, multichannel video programming distributor
(MVPD), or broadcast television station (RF). OTT content is
delivered to an end-user device; the ISP’s only role is
transporting IP packets, not controlling the content. OTT has
created a new world of flexibility for content rights owners,
who now have the option to supplement the revenue from
distribution window agreements with a direct
business-to-consumer (B2C) offering by launching an OTT
application backed by a solid marketing plan.
WHAT IS BIG DATA?
Today, “Big Data” no longer means just a large number of rows
and columns. Companies now use big data analytics to
drive marketing. The sheer assortment of data available from
a user’s social media activity and interactions with a mobile
device, along with the metadata from the content the user is
interacting with on those devices, requires new techniques to
store, manage, and analyze it. The data to be analyzed in
real time is now unstructured, meaning that it is not
organized in a pre-defined manner. An OASIS
(Organization for the Advancement of Structured
Information Standards) standard, as of March 2009, is the
only industry standard for content analysis. Recently, content
analysis has increasingly been used for deep analysis and
understanding of media consumption. Data analysis now
has to handle a large amount of contextual information
because of new ad-driven OTT, social media, and mobile
device business models. Big data requires a variety of
statistical techniques that extract value from data and
analyze current and historical facts to make predictions
about future or otherwise unknown events. Data mining is
antecedent to any data analysis technique and focuses on
modeling and knowledge discovery for predictive purposes.
Data sets are growing rapidly, in part because they are
increasingly gathered by cheap and numerous
information-sensing mobile devices, aerial remote sensing,
software logs, digital cameras, microphones, radio frequency
identification (RFID) readers, and wireless sensor networks.
Big Data represents information that is high in Volume,
Velocity, Variety, and Veracity and requires specific
technology and analytical methods for its transformation
into Value.
Is Content Analysis Enough?
Content analysis is a technique for the contextualized
interpretation of data with the goal of producing valid
inferences. Content analysis has been applied to a
variety of applications, from flagging profane text to
measuring the success of public relations campaigns [1].
One example is Mimetic Convergence, created by Fátima
Carvalho for the comparative analysis of electoral
proclamations on free-to-air television [2]. Content analysis
is considered by some to be quasi-evaluation because its
judgments need not be based on value statements if the
research objective is to present subjective experiences.
How Was Data Used In The Old Days Of TV?
The television industry has been based on the “estimated”
value of a viewer. In the old days, this value was based on
estimating the viewership of a program from a “small”
sample of the audience in the designated market area (DMA)
who agreed to have a meter attached to their television to
determine whether the television was on and what channel
it was tuned to in each quarter hour. This was combined
with a “small” sample of the audience in that DMA filling in
a diary (book) with their sex, age, and race and the channel,
call letters, and program title they watched during each
quarter hour.
The most recent demographic data would be combined with
the household viewing (HUTS) for the relevant months and
program (or daypart) so the advertiser could determine an
estimated rating for the program (or daypart). This rating
and the cost per point (CPP) for the daypart would be
negotiated between the ad salesperson and the media buyer
to determine the price the buyer would be willing to pay for
the ad. After the ad ran, the buyer would compare the
negotiated rating with the actual rating the program
achieved in the quarter hour in which the ad ran. If the
advertisement did not achieve the negotiated rating, the TV
station would owe the advertiser a “Make Good.”
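To make the arithmetic concrete, the following is a minimal sketch of how a spot price follows from a negotiated rating and CPP, and how an under-delivery could be measured. The function names and all numbers are hypothetical and chosen only for illustration; the sketch assumes the price is simply rating points multiplied by cost per point.

```python
def ad_cost(rating_points: float, cost_per_point: float) -> float:
    """Price of a spot: estimated rating points times cost per point (CPP)."""
    return rating_points * cost_per_point


def make_good_points(negotiated_rating: float, actual_rating: float) -> float:
    """Rating points still owed if the spot under-delivered, otherwise 0."""
    return max(0.0, negotiated_rating - actual_rating)


# Hypothetical numbers, for illustration only.
negotiated = 5.0   # estimated rating for the daypart
cpp = 400.0        # negotiated cost per rating point, in dollars
actual = 4.0       # rating measured after the ad ran

print(ad_cost(negotiated, cpp))              # price paid for the spot
print(make_good_points(negotiated, actual))  # shortfall owed as a make-good
```

In this sketch a spot that meets or exceeds its negotiated rating owes nothing, which mirrors the make-good rule described above.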
Management of Big Data in OTT
In an OTT solution, the distribution of a movie or TV show
to a customer might not require the direct involvement of a
telco. The OTT backend management system, however,
should be telco-grade software as a service in the cloud so
that it can manage the enormous amount of real-time big
data. This OTT solution should also be able to generate
real-time reports on viewership. The software as a service
should be a unified one-stop shop for the overall
management of the content, including digital broadcast
rights and user app management to determine whether the
user has the right to view the content. The software should
also manage the information that is analyzed about the user
in order to feed this big data to recommendation engines and
ad servers.
The problem with content suggestion and advertisement
engines is that they not only require fast, real-time analysis
of the user data but must also constantly ingest the data,
clean and enrich it to make it usable by the OTT client, and
then present it to the user in an engaging manner. At the
backend of this ecosystem there is also predictive analytics,
which includes mining and warehousing the data, machine
learning, and sentiment analysis on the metadata and the
user data obtained from social media.
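The ingest, clean, and enrich steps described here can be sketched as a small pipeline. This is only an illustration under stated assumptions: the event fields (`user_id`, `title_id`, `device`) and the `CATALOG` lookup are hypothetical and do not describe any particular OTT platform.

```python
from typing import Dict, Iterable, List

# Hypothetical content metadata keyed by title ID.
CATALOG: Dict[str, Dict[str, str]] = {
    "t1": {"title": "Space Drama", "genre": "sci-fi"},
    "t2": {"title": "Cook-Off", "genre": "reality"},
}


def clean(events: Iterable[dict]) -> List[dict]:
    """Drop events missing required fields and normalize the device label."""
    cleaned = []
    for event in events:
        if not event.get("user_id") or not event.get("title_id"):
            continue  # unusable without knowing who watched what
        device = str(event.get("device", "unknown")).strip().lower()
        cleaned.append({**event, "device": device})
    return cleaned


def enrich(events: Iterable[dict], catalog: Dict[str, Dict[str, str]]) -> List[dict]:
    """Attach content metadata; titles missing from the catalog are marked 'unknown'."""
    enriched = []
    for event in events:
        meta = catalog.get(event["title_id"], {"title": "unknown", "genre": "unknown"})
        enriched.append({**event, **meta})
    return enriched


raw_events = [
    {"user_id": "u1", "title_id": "t1", "device": " Tablet "},
    {"user_id": None, "title_id": "t2", "device": "phone"},  # dropped: no user
    {"user_id": "u2", "title_id": "t9"},                     # kept: title not in catalog
]
ready = enrich(clean(raw_events), CATALOG)
for row in ready:
    print(row)
```

Keeping cleaning and enrichment as separate functions is a design choice of this sketch; it lets each step be tested and replaced on its own.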
How is Big Data helping Over the Top TV?
With the advent of Over the Top TV (OTT) viewing on mobile
devices, the obsession with social media, and the increase in
in-app purchases, the pool of information available about a
viewer and their viewing and buying habits has made it
easier and more effective to predict a viewer’s buying
behavior. OTT data can be combined with social media data
by using a social media login for the OTT app on a mobile
device.
With the Graph API provided by Facebook and the REST API
provided by Twitter, it has become easier to get to know a
viewer. With these APIs tied into an OTT app, information
about a user’s age, gender, friends, status, tweets, follows,
interests, or likes could be combined into a data pool. A big
data analysis system could use this information along with
how viewers react to certain videos, movies, products, or
services when they mention them on social media. This
analysis could be done for each viewer and is a great asset
for designing specific marketing and advertising campaigns
directed at individual viewers. An enormous amount of
information can also be obtained from the viewer’s mobile
device that is running the OTT app.
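As a rough illustration of such a data pool, the sketch below merges viewing records with social attributes using the shared login ID as the join key. It does not call the Facebook or Twitter APIs; the records, field names, and the `build_profiles` helper are all hypothetical and stand in for data already retrieved with the user's permission.

```python
from collections import Counter
from typing import Dict, List


def build_profiles(views: List[dict], social: Dict[str, dict]) -> Dict[str, dict]:
    """Merge viewing counts with social attributes, keyed by the shared login ID."""
    profiles: Dict[str, dict] = {}
    for view in views:
        profile = profiles.setdefault(
            view["login_id"], {"genres_watched": Counter(), "likes": [], "age": None}
        )
        profile["genres_watched"][view["genre"]] += 1
    for login_id, attrs in social.items():
        if login_id in profiles:  # keep only users seen in the OTT app
            profiles[login_id]["likes"] = list(attrs.get("likes", []))
            profiles[login_id]["age"] = attrs.get("age")
    return profiles


# Hypothetical records, for illustration only.
views = [
    {"login_id": "u1", "genre": "sci-fi"},
    {"login_id": "u1", "genre": "sci-fi"},
    {"login_id": "u1", "genre": "comedy"},
    {"login_id": "u2", "genre": "sports"},
]
social = {
    "u1": {"age": 34, "likes": ["space documentaries", "running shoes"]},
    "u3": {"age": 51, "likes": ["gardening"]},  # never used the app, so ignored
}

profiles = build_profiles(views, social)
print(profiles["u1"]["genres_watched"].most_common(1))
print(profiles["u2"])
```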
What Data Is Available From An Application?
Advertisers are naturally interested in understanding the
individuals who take advertising actions. IDFAs (and their
Android siblings, Android Advertising IDs, or AAIDs) help an
advertiser identify the specific phone where the ad action
takes place. Every iOS device comes with a Unique Device
Identifier (UDID). Until recently, the UDID allowed
developers and marketers to track activity on the device,
such as app purchases. The UDID has been replaced with the
Identifier for Advertising (IDFA). Inventory bid requests
from Android devices pass the AAID, which provides the
same type of device-specific, unique, resettable ID for
advertising as the IDFA. The ID for tablet devices with
multiple users may also be unique per user.
Ad exchanges support passing the IDFA or the Google
Advertising ID (AAID) in mobile application inventory bid
requests. When consumers take actions as a result of ads,
such as clicking a banner, playing a video, or installing an
app, media companies can pass the IDFA along with
information about the action that took place. Most media
companies do pass IDFAs. Some, including some large social
networks, do not pass device IDs to advertisers but do allow
advertisers to target specific IDs within their properties. The
IDFA enables an advertiser to target specific individuals
who have taken actions in the past.
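One way to picture how passed advertising IDs support retargeting is to group the IDs by the ad action their device performed. The sketch below is a simplification under stated assumptions: the event layout, the `advertising_id` field, and the placeholder IDs are hypothetical and are not taken from any ad exchange's specification.

```python
from collections import defaultdict
from typing import Dict, List, Set


def audiences_by_action(events: List[dict]) -> Dict[str, Set[str]]:
    """Group advertising IDs by the ad action their device performed."""
    audiences: Dict[str, Set[str]] = defaultdict(set)
    for event in events:
        ad_id = event.get("advertising_id")
        if ad_id:  # some media companies do not pass device IDs at all
            audiences[event["action"]].add(ad_id)
    return dict(audiences)


# Placeholder IDs and actions, not real identifiers.
events = [
    {"advertising_id": "id-aaa", "action": "click"},
    {"advertising_id": "id-bbb", "action": "install"},
    {"advertising_id": "id-aaa", "action": "install"},
    {"advertising_id": None, "action": "click"},  # ID withheld by the publisher
]

audiences = audiences_by_action(events)
print(sorted(audiences["install"]))  # devices that took the install action
```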
How Can You Target An Audience With An App?
The most direct method is to tie a device directly to
purchase-based or public-record data using device IDs.
Another method is to collect opt-in data directly from
consumers (e.g., age, gender, zip code), usually through
registration. A third method is to infer information about a
user from usage behavior (e.g., types of apps used most
frequently, time of day, websites visited), which is
commonly used to reach users based on their interests.
There are also many ways to obtain location from a mobile
device, so a fourth strategy is to observe a device’s location
over time to build an audience profile (e.g., a device that
moves around the country, often on weekdays, suggests a
business traveler). Tracking and recording such
observations helps simplify the analysis of data. Instead of
finding patterns for each user across all of the content
available on the OTT platform, which requires an
astronomical amount of computation, the algorithms can
find patterns between groups of similar people and the
content that they would want to watch. This is done by using
clustering algorithms rather than supervised learning
techniques.
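The paper does not name a specific clustering algorithm, so the sketch below uses plain k-means as one possible choice. The viewer features (weekly hours of drama and sports) are hypothetical, and starting the centroids at the first k points is a simplification that keeps the example deterministic.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Point], k: int, iterations: int = 20) -> List[int]:
    """Return one cluster label per point using plain k-means."""
    centroids = [list(p) for p in points[:k]]  # simplification: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each viewer to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical viewers: (weekly hours of drama, weekly hours of sports).
viewers = [(9.0, 1.0), (1.0, 8.0), (8.0, 0.5), (0.5, 9.0), (7.5, 1.5), (1.5, 7.0)]
labels = k_means(viewers, k=2)
print(labels)
```

Once viewers are grouped this way, content or ads can be matched to a handful of groups rather than to every individual user.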
When "Limit Ad Tracking" is turned on, apps are still
allowed to collect the IDFA, but they are supposed to refrain
(on the honor system) from using it to target ads. Nothing
stops servers from continuing to track a device (i.e., collect,
store, and aggregate personal, device, or behavioral data
along with the IDFA). If a user opts in and gives an app
permission to access Location, Contacts, or Calendar, that is
as good as or better for tracking than a simple UDID or IDFA.
Fingerprinting may not be accurate enough to track viewers
over the long term, but it could be very accurate for tracking
conversions within a short time window (24 hours).
Social Media Research in Entertainment
In previous research [3], 2.89 million tweets from 1.2 million
users were analyzed using natural language processing
algorithms to predict the box-office success of movies. There
have also been models that correlate sentiment in blog posts
with movie box-office scores [4]. Most of the current work in
this area is narrow and tailored to a very specific problem.
Our problem is much more complex because we have so
much data, and so much variety of data, for each user.
Predictive models for real-time, ad hoc advertisement
delivery and content suggestions require exhaustive
correlation computations. On top of this come sentiment
analysis and recommendations based on the traditional
attributes in use today: MPAA ratings, run time, release
dates, number of screens on which a movie debuted,
presence of certain actors or actresses, genre, and similar
metadata. According to the FreeWheel Video Monetization
Report Q2 2015 [5], OTT streaming devices accounted for
10% of video ad views in the second quarter of 2015, with
194% year-over-year growth.
As an example, if we know a viewer’s ‘Likes’ and ‘Follows’
for a certain TV show, activity, or product, a simple
correlation model can be used to recommend
advertisements. Correlation refers to the strength of a
relationship or interdependence between two sets of data or
two variables. Here, one set is the user’s data based on
interests, and the other is the data collected through
platforms like Gracenote and the metadata in the OTT client
application. The correlation coefficient is a value between -1
(perfect negative correlation) and +1 (perfect positive
correlation) that reflects the strength of the linear
relationship between these two sets of data. A simple
formula given by Karl Pearson can be used to find this
correlation (r-value):
$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$
where x_i and y_i are the paired values from the two data
sets and \bar{x} and \bar{y} are their means.
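As a minimal sketch, Pearson's r can be computed by dividing the sum of the products of the paired deviations from the means by the square root of the product of the two sums of squared deviations. The genre scores below are hypothetical: one list stands for a user's interest per genre, the other for how strongly a title's metadata matches each genre.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of paired deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores per genre: (sci-fi, comedy, sports, reality).
user_interest = [5.0, 3.0, 0.0, 1.0]
title_match = [0.9, 0.5, 0.1, 0.2]

r = pearson_r(user_interest, title_match)
print(round(r, 3))  # close to +1 means the title fits the user's interests
```

A title or advertisement whose scores correlate strongly and positively with the user's interests would be a candidate for recommendation; values near zero or below would not.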
How are Real People Different From Consumers?
For Big Data to realize its potential, a new paradigm must be
conceived and developed that captures how real people,
living real lives in real time, become “inclined” to make a
particular purchase. Dr. Bob Deutsch, a cognitive
anthropologist and founder of the consulting company
Brain-Sells, has been working on this problem for the last
two decades [6].
First, Dr. Bob’s research suggests advertisers now need to
evolve from considering products as brands to considering
"person-as-brand." With the growth of social media and the
focus on “Me” (one reflection being the emergence of the
selfie), every person wants to be their own brand: to perform,
and to be liked, looked at, followed. With "me-as-brand," the
secret to corporate success is to understand that buying a
product is not simply a response to viewing a sales pitch
with attendant product attributes, but a matter of entering
people’s already ongoing self-narrative. A narrative has a
plot, has non-stereotypical characters with a point of view,
has a mise-en-scene, has obstacles and meaningful conflict,
has surprises (non-linearity), and has a sense of an ending
(that intimates a new beginning). In other words, we are
talking about the structure of good storytelling. The mind is
primed to respond to “story,” and a good story is one that a
viewer can identify with and insinuate their own story into.
Connection From Attachment
The task now is not so much how a brand or product
presents its pitch, but how it enters the ongoing
self-narrative a person already embodies. Being just
informative and entertaining is not enough for an ad or any
form of communication to be successful. Counting eyeballs
is not predictive.
Big Data from OTT should be used to capture the processes
by which a person creates models of the world and then
translates them into "my world." Attachment occurs only
when a person's story about him- or herself is merged with
the story that individual has about the product. This
self-directed merging of self-story and product-story
produces a rock-solid bond with the product. This isn't
product loyalty. It's self-loyalty, because in this attachment
process products are not an end point; they become manifest
in the person themselves. This intrinsic self-expansive
aspect of a product (as lived by real people) is exactly what
is responsible for the deep, abiding emotional connection
between a person and a product.
Interest vs. Identify
Neurological experiments have demonstrated that when we
identify with another -- when we feel something is part of us
-- the brain's medial prefrontal cortex (the brain region
involved with self-definition) is activated. In this case, the
product is felt to fit into the picture a person has of himself
or herself. A reverie about self is provoked in which a
self-referring narrative envelops the person. In contrast,
when a person feels the attributes of a product are simply
good but doesn't identify with it, the brain region known as
the putamen lights up. This experience is rewarding but not
self-involving. The object remains external.
Preference and purchase come from identification, not from
comparing product attributes to see what product is best or
even from the fulfillment of one’s interests.
We humans crave the satisfaction that comes when our
identities are understood and reflected back. This
identification process is not wholly logical. It is a process
that is:
· Emotionally-based (and emotion trumps rational,
linear, objective thought)
· Symbolically-energized (wherein a referent’s
emotional associations are a major driver)
· Metaphorically-derived (connecting seemingly non-
related things)
· Narratively-constructed (with little regard to
objective time or linear causality).
This is the nature of the human “attachment” process that
predictive analytics must come to terms with if it is ever
going to really be predictive!
How Attachments Are Formed Between Products and People
Attachments are generated from the simultaneous activation
of three feelings:
Familiarity: A person must perceive that there is something
about you that is instantly recognizable as like her or him
[It’s Like Me].
Appeasement: A person must perceive that you understand
some things about them, and must feel their point of view is
considered and appreciated [It Likes Me]; a sense of trust
develops from this.
Power: A person must perceive your product as different
from them and sense that, through that difference, you can
help them be more. A person feels that you can help them
make manifest something that is already in them (in their
self-story), but latent.
To be successful an advertiser must feed people's appetite
for self-expansion.
The Potential Of Over The Top TV (OTT)
The storehouses of ever-increasing “big data” from OTT and
social media apps on mobile devices may reveal sociological
laws of human behavior enabling the prediction of buying
habits, just as physicists and chemists can predict certain
natural phenomena. However, this can only be done if we
push through to the truth of how human behavior and
emotion really work, and not cover up the incompleteness of
our current marketing perspective with mountains of data.
Size sometimes doesn’t matter.
In some cases, the accessibility and computerization of huge
databases has begun to spur the development of new
statistical techniques and new software to manage data sets
with trillions of entries or more. But what are we managing?
The deeper question remains, of whether it will be possible
to discern behavioral laws that can predict human consumer
behavior. People are not wholly rational, objective, or linear
machines. People are makers and gatherers of meaning. How
and why people attach to ideas or things is based on their
temperament, their current circumstances, their deep default
narratives about how they see themselves and the world, the
background context (cultural and societal) they live in, and
how all these things blend into a unity. When “Big Data”
even begins to face up to this reality, that will be a “BIG
BANG!” Then, and only then, can marketers cash in, all the
way to the bank.
References
[1] Tipaldo, G. (2014). L' analisi del contenuto e i mass media. Bologna, IT:
Il Mulino. p. 42 ISBN: 9788815248329
[2] Lipset, Seymour M.; Rokkan, Stein (1967). “Cleavage Structures, Party
Systems, and Voter Alignments,” Free Press. pp. 1–64.
[3] S. Asur and B. A. Huberman, “Predicting the Future With Social
Media”, Social Computing Lab, HP Labs
[4] G. Mishne and N. Glance. “Predicting movie sales from blogger
sentiment”, AAAI 2006 Spring Symposium on Computational Approaches
to Analysing Weblogs, 200
[5] FreeWheel Video Monetization report Q2 2015, “The new prime time is
anytime”
[6] Deutsch, Bob. “For Success in Social Media, Conversation Is Not
Enough: You Need Narrative,” Fast Company (Co.Create),
16 December 2014.