Scalable advertising recommender systems

From Search, Display to Mobile, Social and TV

By Joaquin A. Delgado, PhD.

ACM San Francisco Bay Area Professional Chapter

Disclaimer

•  The content of this presentation is my own personal opinion and does not officially represent my employer's view in any way. Included content is especially not intended to convey the views of Intel Media (an Intel Corp subsidiary) or Intel Corporation.

Objectives

•  Demonstrate the strong similarities between advertising and recommender systems
•  Illustrate some of the techniques used to build large-scale advertising systems that can be used to build effective and scalable recommender systems.

Agenda

•  Introduction to Recommender Systems
•  Introduction to Advertising Systems
•  Example: Video Advertising Exchange
•  OK, So How Do We Scale?
•  The Business of Recommendations
•  The Crux of Metrics and Evaluation
•  Q&A

Introduction to Recommender Systems

Recommender Systems

•  Recommender systems or recommendation systems (a.k.a. recommendation engines/platforms) are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered.

•  Recommender systems have been around since the 1980s, primarily applied to e-commerce and various social and media services.
•  E.g. movie recommendations:
   –  Univ. of Minnesota, MovieLens (circa 1997)
   –  2009 Netflix $1M Challenge

Evolution of Recommender Systems

Problem

•  User i, with user features x_i (demographics, browse history, search history, …), interacts with item j, with item features x_j (keywords, content categories, ...)
•  The algorithm selects item j for user i
•  (i, j): response y_ij (explicit rating, implicit click/no-click)
•  Predict the unobserved entries based on features and the observed entries

Algorithmic Approaches (1): Collaborative Filtering

Better performance for old users and old items
Does not naturally handle new users and new items (cold-start)
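
As a rough illustration of the collaborative filtering idea (predicting unobserved user-item entries from the observed ones), here is a minimal matrix-factorization sketch trained with stochastic gradient descent. It is not taken from the talk: the function names, the toy ratings and the hyperparameters are all assumptions chosen for illustration.

```python
import random

def predict(user_f, item_f, u, i):
    """Predicted response for user u and item i: dot product of latent factors."""
    return sum(p * q for p, q in zip(user_f[u], item_f[i]))

def factorize(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.01,
              epochs=2000, seed=0):
    """Fit latent factors to observed (user, item, rating) triples with SGD.

    All names and hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    user_f = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    item_f = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(user_f, item_f, u, i)
            for k in range(rank):
                pu, qi = user_f[u][k], item_f[i][k]
                # Gradient step on squared error with L2 regularization.
                user_f[u][k] += lr * (err * qi - reg * pu)
                item_f[i][k] += lr * (err * pu - reg * qi)
    return user_f, item_f

# Toy data (hypothetical): users 0-1 like items 0-1, users 2-3 like items 2-3.
full = [[5, 5, 1, 1], [5, 5, 1, 1], [1, 1, 5, 5], [1, 1, 5, 5]]
held_out = {(0, 1), (3, 0)}  # entries treated as unobserved
observed = [(u, i, full[u][i]) for u in range(4) for i in range(4)
            if (u, i) not in held_out]
user_f, item_f = factorize(observed, n_users=4, n_items=4)
print(predict(user_f, item_f, 0, 1), predict(user_f, item_f, 3, 0))
```

A user or item with no observed entries keeps its near-zero random factors, which is one way to see the cold-start limitation noted above.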

Algorithmic Approaches (2): Content-Based

Classification Task

Limitation: need predictive features
Bias often high, does not capture signals at granular levels

Other Critical Limitations

•  Lack of Contextually-Aware Recommendations
   –  Recommendations do not happen in a vacuum; context such as time of day, type/size of the device, geo-location, surrounding content and even more granular user information (e.g. behavioral user segments) is key to providing more relevant and timely recommendations
•  Scaling Recommender Systems is hard!
   –  Dimensionality reduction and some recent map-reduce implementations of matrix factorization and ML algorithms are a step in the right direction, yet on their own they have not been tested at "Internet Scale"

Recommender System Redux

•  True goals of a Recommender System
   –  Amaze the user by suggesting captivating content and useful services that are contextually relevant and timely
   –  Enable further monetization via potential up-sell and cross-sell opportunities of content and services that actually matter to the user.
   –  Do all this at scale!

Introduction to Advertising Systems

Advertising

•  Advertising is a form of communication for marketing and used to encourage, persuade, or manipulate an audience (viewers, readers or listeners; sometimes a specific group) to continue or take some new action. Most commonly, the desired result is to drive consumer behavior with respect to a commercial offering, although political and ideological advertising is also common.

Long History of Traditional Advertising

Online Advertising

•  A form of promotion that uses Internet technology for the express purpose of delivering marketing messages to attract customers.

The Rise of Online Advertising

Online Advertising Spending Tops $100 Billion in 2012

Why Online Advertising?

Computational Advertising

•  Computational advertising is at the intersection of large scale search and text analysis, information retrieval, statistical modeling, machine learning, optimization, and microeconomics. The central challenge of computational advertising is to find the "best match" between a given user in a given context and a suitable advertisement.
•  Depending on the definition of "best match" this challenge leads to a variety of massive optimization and search problems, with complicated constraints.

Key Enabling Technology

•  Systems that Scale
   –  Distributed Computing
   –  Distributed Data Processing
   –  No-SQL/New-SQL Databases
•  Marketplace Design
   –  Auction and Game Theory
   –  Yield Optimization
   –  Bidding Agents
•  Connecting Markets
   –  Real-time Bidding (RTB)
•  Pervasive Internet Computing
   –  Proliferation of Internet Connected Devices

The World of Online Advertising

•  Format: Text, Image, Rich Media, Video
•  Device: Computer, Tablet, Phone, Television
•  Channel: Search, Display, Email, Social
•  Objective: Brand, Performance
•  UX: In-App or In-Browser

The Marketplace

•  Audiences
•  Advertising Opportunities
•  Publishers
•  Service Providers
•  Advertisers
•  Ads

How Are Audiences Selected?

•  Search Keyword
•  Geo-location
•  Contextual
•  Behavioral
•  Retargeting

Data is King!

Delivery Options and Market Types

•  GD means Guaranteed Delivery and is synonymous with brand, wholesale and fixed-price online advertising.
•  NGD means Non-Guaranteed Delivery and is synonymous with performance, retail, spot-market (auction-based) online advertising.

How are ad opportunities priced?

–  CPM (Cost Per Mille), also called "Cost Per Thousand" (CPT), is where advertisers pay per impression or exposure of their message to a specific target audience, priced per thousand impressions.
–  CPC (Cost Per Click) is also known as pay-per-click (PPC). Advertisers pay each time a user clicks on their listing and is redirected to their website.
–  CPA (Cost Per Action), or cost per acquisition, advertising is performance based and is common in the affiliate marketing sector of the business.
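
To compare offers priced under these different models, a common industry convention (not stated in the slides, so treat it as an assumption) is to normalize everything to an effective CPM, i.e. expected revenue per thousand impressions. The sketch below uses hypothetical click-through (`ctr`) and conversion (`cvr`) rates.

```python
def ecpm_from_cpm(cpm):
    """A CPM price is already expressed per thousand impressions."""
    return cpm

def ecpm_from_cpc(cpc, ctr):
    """Expected revenue per 1000 impressions when paid per click."""
    return cpc * ctr * 1000.0

def ecpm_from_cpa(cpa, ctr, cvr):
    """Expected revenue per 1000 impressions when paid per action.

    ctr: probability an impression is clicked; cvr: probability a click converts.
    """
    return cpa * ctr * cvr * 1000.0

# Hypothetical offers competing for the same ad opportunity.
offers = {
    "brand_cpm": ecpm_from_cpm(4.00),
    "search_cpc": ecpm_from_cpc(0.50, ctr=0.01),
    "affiliate_cpa": ecpm_from_cpa(15.00, ctr=0.01, cvr=0.02),
}
best = max(offers, key=offers.get)
print(best, offers)
```

In practice the rates are themselves model predictions, so the ranking is only as good as the response-prediction models behind them.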

Advertising Funnel and Marketing Strategies

•  Brand Advertising
•  Performance Advertising

Bidding and Yield Optimization

Real-time Bidding (RTB) facilitates the connection of Supply and Demand from different private marketplaces

Summary

Channel | Market  | Formats     | Pricing  | Devices | Targeting                                         | UX
Search  | NGD     | Text        | CPC      | All     | Keyword, Geo-location                             | Browser
Display | GD, NGD | All         | All      | All     | All                                               | Browser, In-App
Social  | NGD     | Text, Image | CPC      | All     | Behavioral, Geo-location, Contextual, Retargeting | Browser, In-App
Email   | GD, NGD | Text, Image | CPM, CPL | All     | Geo-location, Behavioral, Retargeting             | Email App

Example Ad System: Video Exchange

3rd-party data is used to identify a user and match it to advertiser demand via impression-level bidding

1.  User visits pubs in an exchange auction marketplace
2.  User clicks on video player to play video
3.  Exchange simultaneously pings all twelve 3rd-party data partners to see whether they have relevant demographic and/or behavioral information matching the target to available impressions across the exchange
4.  Exchange matches advertiser demand to qualified users
5.  The ad server serves a relevant pre-roll to that user in real time.

Responding to a Pub Ad Call

1.  Publisher ad call. Publisher: "Yo! I need an ad!"
2.  Exchange/Network responds with XML doc. Exchange/Network: "No prob Home Slice! Here's an XML doc with all the info to execute the ad."

The XML file is the recipe to execute the video ad!

Pub follows XML file recipe to execute ad

Publisher: "I now have my XML doc recipe … Now I'll follow the recipe to show the ad."

1.  Publisher page follows the XML doc recipe
2.  Request for video ad file goes to the 3rd Party Video Ad Server
3.  Video ad file sent to the Publisher's video player
4.  Pre-roll ad plays for the End User & beacon events provide metrics

More Players, More Redirections

Advertisers use their "primary" ad server to manage the campaign and then hand off the ad calls to a "secondary" rich media ad server, finally pulling the ad from a content delivery network as in the diagram above. This type of daisy-chaining is also quite common with ad exchanges that handle remnant inventory, thus creating even more redirections.

OK, So How Do We Scale?

•  What is the Right Architecture?
•  What are the best Data Structures?
•  What family of Algorithms?

Scalable FE Serving Architecture

[Diagram: the Publisher sends an impression to the Impression-Processing Server, which passes it to a set of Bid-Generation Servers holding the Index and Model Partitions (Data); the Bid-Generation Servers return bids and auction info.]

Bid-Generation Server Farm

[Diagram: a grid of Bid-Generation Servers, where #columns = #partitions = M and #rows = #replicas = N]

Bidding System Structure

•  Impression-Processing Server annotates the submitted impression, scatters the impression to a set of Bid-Generation Servers, gathers top bids from local auctions, and computes the overall top bids for the impression by running a global auction
•  Each Bid-Generation Server works on a partition of demand data, generates bids for a given impression based on that data partition, conducts a local auction across those bids, and returns local winners and the corresponding auction info
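
The scatter-gather structure described above can be sketched as follows. This is a toy, in-process illustration under stated assumptions: the class and function names, the campaign tuples and the segment-based eligibility test are hypothetical, and a thread pool stands in for remote calls to partitioned servers.

```python
from concurrent.futures import ThreadPoolExecutor

class BidGenerationServer:
    """Holds one partition of demand data (hypothetical campaign tuples)."""

    def __init__(self, campaigns):
        self.campaigns = campaigns  # list of (campaign_id, target_segment, bid)

    def local_auction(self, impression, top_k):
        # Generate bids from this partition only, then keep the local winners.
        bids = [(bid, cid) for cid, segment, bid in self.campaigns
                if segment in impression["segments"]]
        return sorted(bids, reverse=True)[:top_k]

def annotate(impression):
    # Placeholder annotation step: normalize segments into a set.
    annotated = dict(impression)
    annotated["segments"] = set(impression.get("segments", ()))
    return annotated

def process_impression(impression, servers, top_k=2):
    """Annotate, scatter to all partitions, gather local winners, run global auction."""
    annotated = annotate(impression)
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        local_winners = list(
            pool.map(lambda s: s.local_auction(annotated, top_k), servers))
    all_bids = [bid for winners in local_winners for bid in winners]
    return sorted(all_bids, reverse=True)[:top_k]

servers = [
    BidGenerationServer([("c1", "sports", 1.5), ("c2", "news", 0.7)]),
    BidGenerationServer([("c3", "sports", 2.0), ("c4", "sports", 0.9)]),
]
top_bids = process_impression({"segments": ["sports"]}, servers)
print(top_bids)
```

Returning only the local top-k is enough here because any bid in the global top-k must also be in the top-k of its own partition.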

Unified BE Data Analytics

•  Descriptive Analytics
   –  OLAP
   –  Reports & Visualization
•  Predictive Analytics
   –  OLTP
   –  Indexes and Models
   –  Ranking
   –  Prediction
   –  Classification
   –  Optimization

Analyzing an Ad Request Flow

1.  Eligibility
2.  Ranking (Auction)
3.  Delivery
4.  Display Ad

EXCHANGE

Eligibility: The Ad Matching Problem

•  BE: age ∈ {10,20} & country ∉ {US}
•  S: age=20 & country=FR & gender=F
•  Given an assignment S, find all matching Boolean expressions (BEs)

Background: Inverted Indexes

•  Posting lists of occurring terms (tokens) with list of documents:positions
•  Used to match queries
   –  Tokens
   –  Boolean operators
•  Search returns documents with relevance score

Indexing Boolean Expressions

•  E1: A ∈ {1}
•  E2: A ∈ {1} & B ∈ {2} & C ∈ {3,4}
•  S: A=1 & B=2

Key   | Posting List
(A,1) | E1, E2
(B,2) | E2
(C,3) | E2
(C,4) | E2
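
A minimal sketch of how such an index can be built and queried, using the E1/E2 example above. It assumes conjunctions of ∈ predicates only and uses a simple counting rule (an expression matches when the number of its predicates hit by the assignment equals its size); ∉ predicates and expressions with no ∈ predicate need the extra machinery of the K-partitioned index and are left out. All function names are illustrative.

```python
from collections import defaultdict

def build_index(expressions):
    """expressions: {expr_id: {attribute: set_of_allowed_values}} (∈ predicates only)."""
    postings = defaultdict(list)
    sizes = {}
    for expr_id, conjunction in expressions.items():
        sizes[expr_id] = len(conjunction)  # number of predicates to satisfy
        for attribute, values in conjunction.items():
            for value in sorted(values):
                postings[(attribute, value)].append(expr_id)
    return postings, sizes

def match(postings, sizes, assignment):
    """Return ids of expressions whose every predicate is satisfied by assignment."""
    hits = defaultdict(int)
    for key in assignment.items():  # key is an (attribute, value) pair
        for expr_id in postings.get(key, ()):
            hits[expr_id] += 1
    return sorted(e for e, count in hits.items() if count == sizes[e])

expressions = {
    "E1": {"A": {1}},
    "E2": {"A": {1}, "B": {2}, "C": {3, 4}},
}
postings, sizes = build_index(expressions)
matched = match(postings, sizes, {"A": 1, "B": 2})
print(matched)
```

For S: A=1 & B=2, the posting lists for (A,1) and (B,2) touch both E1 and E2, but E2 is hit by only two of its three predicates, so only E1 is returned.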

K-Inverted List Construction

ID | Expression                               | K
1  | age ∈ {3} ∧ state ∈ {NY}                 | 2
2  | age ∈ {3} ∧ gender ∈ {F}                 | 2
3  | age ∈ {3} ∧ gender ∈ {M} ∧ state ∉ {CA}  | 2
4  | state ∈ {CA} ∧ gender ∈ {M}              | 2
5  | age ∈ {3, 4}                             | 1
6  | state ∉ {CA, NY}                         | 0

Figure 1: A set of conjunctions

K | Key and UB       | Posting List
0 | (state, CA), 2.0 | (6, ∉, 0)
0 | (state, NY), 5   | (6, ∉, 0)
0 | Z, 0             | (6, ∈, 0)
1 | (age, 3), 1.0    | (5, ∈, 0.1)
1 | (age, 4), 3.0    | (5, ∈, 0.5)
2 | (state, NY), 5   | (1, ∈, 4.0)
2 | (age, 3), 1.0    | (1, ∈, 0.1) (2, ∈, 0.1) (3, ∈, 0.2)
2 | (gender, F), 2   | (2, ∈, 0.3)
2 | (state, CA), 2.0 | (3, ∉, 0) (4, ∈, 1.5)
2 | (gender, M), 1.0 | (3, ∈, 0.5) (4, ∈, 0.9)

Figure 2: Inverted list for Figure 1

Ranking Phase I: Top-K Selection

•  Search algorithm for DNF/CNF BEs with relevance ranking
•  The score of a BE E reflects the "relevance" of E to an assignment S. For example, a user interested in running might be more interested in an advertisement on shoes than an advertisement on flowers

Example: Scoring

•  S = {age=3, state=NY, gender=F}
•  Ws = (1, 2, 3)
•  Score(BE1) = 0.1*1 + 4.0*2 = 8.1
•  Score(BE2) = 0.5*1 + 0.3*3 = 1.4

K | Key and UB      | Posting List
2 | (state, NY), 5  | (1, ∈, 4.0)
2 | (age, 3), 1.0   | (1, ∈, 0.1) (2, ∈, 0.5)
2 | (gender, F), 2  | (2, ∈, 0.3)

ID | Expression                | K
1  | age ∈ {3} ∧ state ∈ {NY}  | 2
2  | age ∈ {3} ∧ gender ∈ {F}  | 2
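
The arithmetic in this scoring example can be reproduced with a short sketch. It assumes the score is the sum, over the ∈ predicates of an expression that the assignment satisfies, of the expression-side weight (from the posting list entry) times the assignment-side weight Ws; the assignment is taken with age=3 so that both expressions match. Function and variable names are hypothetical.

```python
def score(expression_weights, assignment, assignment_weights):
    """Sum w_E(attr, value) * w_S(attr) over predicates the assignment satisfies."""
    total = 0.0
    for (attribute, value), w_e in expression_weights.items():
        if assignment.get(attribute) == value:
            total += w_e * assignment_weights[attribute]
    return total

# Assignment S and its weights Ws = (1, 2, 3) for (age, state, gender).
s = {"age": 3, "state": "NY", "gender": "F"}
ws = {"age": 1, "state": 2, "gender": 3}

# Expression-side weights as listed in the posting list entries.
be1 = {("age", 3): 0.1, ("state", "NY"): 4.0}
be2 = {("age", 3): 0.5, ("gender", "F"): 0.3}

score_be1 = score(be1, s, ws)  # 0.1*1 + 4.0*2
score_be2 = score(be2, s, ws)  # 0.5*1 + 0.3*3
print(score_be1, score_be2)
```

The UB values stored with each key bound the contribution any entry in that posting list can make, which is what allows a top-K search to skip expressions that cannot reach the current K-th best score.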

Matching Requires Two Kinds of Indexes

Example: Ad Matching

•  Assignment [S]: age=20 & country=FR & gender=F
•  Boolean Expression [SF]: age ∈ {10,20} & country ∉ {US}
   –  Given an assignment S, find all matching Boolean expressions (SFs)
•  Boolean Expression [DF]: ad_size ∈ {800x400,200x50} & type ∉ {flash}
•  Assignment [D]: crtv_tag=sports & size=800x400 & type=Flash
   –  Given a Boolean expression DF, find all matching assignments (Ds)

Return all matching Ad Units satisfying the two-way match!!

•  Opportunity Query = Supply Attributes (values) ^ Demand Filters (BE)
•  Indexed Ad Units = Demand Attributes (values) ^ Supply Filters (BE)
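
A brute-force sketch of the two-way match may help make the idea concrete: an ad unit is returned only if its supply filters accept the opportunity's supply attributes and the opportunity's demand filters accept the ad unit's demand attributes. A production system would answer both directions from indexes rather than by scanning; the data layout, the attribute name `size` on both sides, and the function names below are assumptions.

```python
def satisfies(filters, attributes):
    """True if every predicate in filters holds for attributes.

    filters maps attribute -> (op, values) with op "in" (∈) or "not_in" (∉).
    """
    for attribute, (op, values) in filters.items():
        if op not in ("in", "not_in"):
            raise ValueError(f"unknown operator: {op}")
        is_member = attributes.get(attribute) in values
        if is_member != (op == "in"):
            return False
    return True

def two_way_match(opportunity, ad_units):
    """Return ids of ad units that match the opportunity in both directions."""
    return [
        ad["id"] for ad in ad_units
        if satisfies(ad["supply_filters"], opportunity["supply_attributes"])
        and satisfies(opportunity["demand_filters"], ad["demand_attributes"])
    ]

opportunity = {
    "supply_attributes": {"age": 20, "country": "FR", "gender": "F"},
    "demand_filters": {"size": ("in", {"800x400", "200x50"}),
                       "type": ("not_in", {"flash"})},
}
fr_filter = {"age": ("in", {10, 20}), "country": ("not_in", {"US"})}
ad_units = [
    {"id": "ad1", "supply_filters": fr_filter,
     "demand_attributes": {"crtv_tag": "sports", "size": "800x400", "type": "image"}},
    {"id": "ad2", "supply_filters": fr_filter,
     "demand_attributes": {"crtv_tag": "sports", "size": "800x400", "type": "flash"}},
    {"id": "ad3", "supply_filters": {"country": ("in", {"US"})},
     "demand_attributes": {"crtv_tag": "news", "size": "200x50", "type": "image"}},
]
matched_ads = two_way_match(opportunity, ad_units)
print(matched_ads)
```

Here ad2 is rejected by the opportunity's demand filter (flash creative) and ad3 by its own supply filter (US only), so only ad1 survives.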

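Ranking Phase II: Auction

•  Bids are computed as an optimization based on objectives subject to budget constraints.

$$v_G = \sum_{g} \text{action-rate}_g \cdot v_g$$

where g is a goal, action-rate_g is the action rate for goal g, and v_g is the goal value.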
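
Read as an expected value, the bid value sums, over the goals of a campaign, each goal's action rate times that goal's value. The sketch below computes that sum and applies one simple (assumed) way of respecting a budget constraint, capping the bid at the remaining budget; the goal names and numbers are hypothetical.

```python
def bid_value(goals):
    """Expected value of an impression.

    goals: {goal_name: (action_rate, goal_value)}
    """
    return sum(rate * value for rate, value in goals.values())

def capped_bid(goals, remaining_budget):
    """Never bid more than what is left of the budget (a simplifying assumption)."""
    return min(bid_value(goals), max(remaining_budget, 0.0))

# Hypothetical campaign: values a click at $1.50 and a conversion at $40.
goals = {
    "click": (0.02, 1.50),
    "conversion": (0.001, 40.00),
}
value = bid_value(goals)
print(value, capped_bid(goals, remaining_budget=0.05))
```

Real pacing logic is usually more involved than a hard cap, but the cap shows where the budget constraint enters the bid.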

Predictive Analytics and Models

•  ML and CF techniques can be used to compute
   –  Weights for Relevance Ranking
      –  Assigned to BE clauses and assignment pairs
   –  Action-Rates
      –  E.g. response prediction: what is the probability of a user completing an ad view, clicking or converting
   –  Optimization
      –  Delivery: Availability and Pacing based on Budgets
      –  Revenue/ROI based Optimization
•  Exploration-Exploitation is required to "learn" new signals.
•  Resulting models should be partitioned and loaded into Bidding Servers

The Business of Recommendations

•  Recommendations impact your business
   –  Create campaigns that target certain audiences, sections of the application, geo-location, etc.
   –  Use recommendations as a way to do promotions as well as upsell and cross-sell
•  Not all item-actions are created equal
   –  Assess the value of the goals. Bidding agents will take care of the rest
   –  Some items have a limited life-span (e.g. window of availability). Be sure to represent this as constraints or budgets

Summary

Advertising | Recommender Systems
Targeting   | Constraints
Budget      | Availability
Bid         | Relevance
Auction     | Selection
Model       | Model

The All-Encompassing Data Engine

[Diagram: a Data Engine at the center, powering Search & Discovery, Recommendations, Advertising and Intelligence]

Data Engine = Data Core + Analytics

The Crux of Metrics and Evaluation

•  Business
   –  Revenue
   –  User Experience
   –  Product and Service Rating
•  Systems
   –  Conversion Rate
   –  ROC Curves
   –  Precision
   –  Recall
•  User
   –  Relevance
   –  Enjoyable
   –  Novelty
   –  Originality

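Bucket Testing and Offline Evaluation

[Diagram: the Ad Server (Random Bucket) produces Traces (100%) and Event Data (impr, click, conv, prob); a Replayer sends the traced Ad Calls to the Ad Server to be evaluated; its HTTP Responses are joined with the Event Data to form the Final Data for Evaluation]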
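
One way the final join could be turned into a metric is an inverse-propensity ("replay") estimate: keep only the logged events where the server under evaluation would have served the same ad as the random bucket did, and weight each by the inverse of the logged serving probability. The slides do not spell out the estimator, so this is an assumption, as is reading the `prob` field as the probability with which the random bucket served the logged ad. Field and function names are hypothetical.

```python
def replay_value(events, policy):
    """Inverse-propensity estimate of a policy's click rate from logged events.

    events: list of dicts with keys "context", "ad", "click", "prob"
    policy: function mapping a context to the ad it would serve
    """
    total = 0.0
    for event in events:
        if policy(event["context"]) == event["ad"]:
            total += event["click"] / event["prob"]
    return total / len(events)

# Hypothetical traces from a random bucket serving two ads with equal probability.
events = [
    {"context": "sports", "ad": "shoes", "click": 1, "prob": 0.5},
    {"context": "sports", "ad": "flowers", "click": 0, "prob": 0.5},
    {"context": "garden", "ad": "flowers", "click": 1, "prob": 0.5},
    {"context": "garden", "ad": "shoes", "click": 0, "prob": 0.5},
]

def contextual_policy(context):
    return "shoes" if context == "sports" else "flowers"

def reversed_policy(context):
    return "flowers" if context == "sports" else "shoes"

good = replay_value(events, contextual_policy)
bad = replay_value(events, reversed_policy)
print(good, bad)
```

The estimate is only meaningful when the logging bucket really did randomize, which is why the traces come from a random bucket.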

The Big Fish: OTT Television

•  Online TV and Video-on-Demand are here to stay
•  Starting to tap into traditional TV/Cable advertising budgets
•  Viewership + Web Data will power new forms of Online Advertisement

References

•  Indexing Boolean Expressions
•  Computational Advertising and Recommender Systems
•  A Market-Based Approach to Recommender Systems
•  ICML'11 Tutorial on Machine Learning for Large Scale Recommender Systems
