In this presentation I will talk about the design of scalable recommender systems and their similarity with advertising systems. The problem of generating and delivering recommendations of content/products to appropriate audiences, and ultimately to individual users at scale, is largely similar to the matching problem in computational advertising, especially in the context of dealing with self- and cross-promotional content. In this analogy with online advertising, a display opportunity triggers a recommendation. The actors are the publisher (website/medium/app owner) and the advertiser (content owner or promoter), whereas the ads or creatives represent the items being recommended, which compete for the display opportunity and may have different monetary value to the actors. To effectively control what is recommended to whom, targeting constraints need to be defined over an attribute space, typically grouped by type (Audience, Content, Context, etc.), where some associated values are not known until decisioning time. In addition to constraints, there are business objectives (e.g. delivery quota) defined by the actors. Both constraints and objectives can be encapsulated into and expressed as campaigns. Finally, there is the concept of relevance, directly related to the prediction of users' responses, which is computed using the same attribute space as signals.
As in advertising, recommendation systems require a serving platform where decisioning happens in real time (a few milliseconds), typically selecting an optimal set of items to display to the user from hundreds, sometimes thousands or millions, of items. User actions are then taken as feedback and used to learn models that dynamically adjust the ordering to meet business objectives.
This is a radical departure from the traditional item-based and user-based collaborative filtering approach to recommender systems, which fails to factor in context, such as time of day, geo-location or the category of the surrounding content, to generate more accurate recommendations. Traditional approaches also fail to recognize that recommendations don't happen in a vacuum and as such may require the evaluation of business constraints and objectives. All this should be considered when designing and developing true commercial recommender/advertising systems.
Speaker Bio
Joaquin A. Delgado is currently Director of Advertising Technology at Intel Media (a wholly owned subsidiary of Intel Corp.), working on disruptive technologies in the Internet TV space. Before that he held CTO positions at AdBrite, Lending Club and TripleHop Technologies (acquired by Oracle). He was also Director of Engineering and Sr. Architect Principal at Yahoo! His expertise lies in distributed systems, advertising technology, machine learning, recommender systems and search. He holds a Ph.D. in computer science and artificial intelligence from Nagoya Institute of Technology, Japan.
Scalable advertising recommender systems
1. Scalable Advertising Recommender Systems
From Search, Display to Mobile, Social and TV
By Joaquin A. Delgado, PhD.
ACM San Francisco Bay Area Professional Chapter
2. Disclaimer
• The content of this presentation is my own personal opinion and does not officially represent my employer's view in any way. Included content is especially not intended to convey the views of Intel Media (an Intel Corp. subsidiary) or Intel Corporation.
3. Objectives
• Demonstrate the strong similarities between advertising and recommender systems
• Illustrate some of the techniques used to build large-scale advertising systems that can be used to build effective and scalable recommender systems.
4. Agenda
• Introduction to Recommender Systems
• Introduction to Advertising Systems
• Example: Video Advertising Exchange
• OK, So How Do We Scale?
• The Business of Recommendations
• The Crux of Metrics and Evaluation
• Q&A
6. Recommender Systems
• Recommender systems or recommendation systems (a.k.a. recommendation engines/platform) are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered.
7. Evolution of Recommender Systems
• Recommender systems have been around since the 1980s, primarily applied to ecommerce and various social and media services.
• E.g. movie recommendations
– Univ. Minnesota, MovieLens (circa 1984)
– 2009 Netflix $1M Challenge
8. Problem
• User i, with user features x_i (demographics, browse history, search history, …), interacts
• Algorithm selects item j, available with item features x_j (keywords, content categories, ...)
• (i, j): response y_ij (explicit rating, implicit click/no-click)
• Predict the unobserved entries based on features and the observed entries
9. Algorithmic Approaches (1): Collaborative Filtering
• Better performance for old users and old items
• Does not naturally handle new users and new items (cold-start)
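To make the cold-start limitation concrete, here is a minimal sketch of user-based collaborative filtering. It is only an illustration: the cosine similarity, the toy `ratings` data and the function names are assumptions, not something taken from the talk.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    if not nu or not nv:
        return 0.0
    return dot / (nu * nv)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine(ratings.get(user, {}), theirs)
        num += sim * theirs[item]
        den += abs(sim)
    # None signals cold-start: no neighbour with usable overlap.
    return num / den if den else None

# Hypothetical toy data.
ratings = {
    "ann": {"m1": 5, "m2": 3},
    "bob": {"m1": 5, "m2": 3, "m3": 4},
    "cat": {"m1": 1, "m3": 2},
}

print(predict(ratings, "ann", "m3"))       # old user, old item: a number
print(predict(ratings, "new_user", "m3"))  # new user: None
print(predict(ratings, "ann", "m9"))       # new item: None
```

With no observed interactions there is nothing to compute a similarity from, which is why new users and new items fall through to `None` here.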
10. Algorithmic Approaches (2): Content Based
• Classification task
• Limitation: need predictive features
• Bias often high, does not capture signals at granular levels
Intel Confidential
11. Other Critical Limitations
• Lack of Contextually-Aware Recommendations
– Recommendations do not happen in a vacuum; context such as time-of-day, type/size of the device, geo-location, surrounding content and even more granular user information (e.g. behavioral user segments) is key to providing more relevant and timely recommendations
• Scaling Recommender Systems is hard!
– Dimensionality reduction and some recent map-reduce implementations of matrix factorization and ML algorithms are a step in the right direction, yet alone have not been tested at "Internet Scale"
12. Recommender System Redux
• True goals of a Recommender System
– Amaze the user by suggesting captivating content and useful services that are contextually relevant and timely
– Enable further monetization via potential up-sell and cross-sell opportunities of content and services that actually matter to the user.
– Do all this at scale!
14. Advertising
• Advertising is a form of communication for marketing and used to encourage, persuade, or manipulate an audience (viewers, readers or listeners; sometimes a specific group) to continue or take some new action. Most commonly, the desired result is to drive consumer behavior with respect to a commercial offering, although political and ideological advertising is also common.
20. Computational Advertising
• Computational advertising is at the intersection of large scale search and text analysis, information retrieval, statistical modeling, machine learning, optimization, and microeconomics. The central challenge of computational advertising is to find the "best match" between a given user in a given context and a suitable advertisement.
• Depending on the definition of "best match" this challenge leads to a variety of massive optimization and search problems, with complicated constraints.
21. Key Enabling Technology
• Systems that Scale
– Distributed Computing
– Distributed Data Processing
– No-SQL/New-SQL Databases
• Marketplace Design
– Auction and Game Theory
– Yield Optimization
– Bidding Agents
• Connecting Markets
– Real-time Bidding (RTB)
• Pervasive Internet Computing
– Proliferation of Internet Connected Devices
22. The World of Online Advertising
• Objective: Brand, Performance
• Channel: Search, Display, Email, Social
• Format: Text, Image, Rich Media, Video
• Device: Computer, Tablet, Phone, Television
• UX: In-App or In-Browser
25. Delivery Options and Market Types
• GD means Guaranteed Delivery and is synonymous with brand, wholesale and fixed-price online advertising.
• NGD means Non-Guaranteed Delivery and is synonymous with performance, retail, spot-market (auction-based) online advertising.
26. How are ad opportunities priced?
– CPM (Cost Per Mille), also called "Cost Per Thousand" (CPT), is where advertisers pay per thousand impressions or exposures of their message to a specific target audience.
– CPC (Cost Per Click) is also known as pay-per-click (PPC). Advertisers pay each time a user clicks on their listing and is redirected to their website.
– CPA (Cost Per Action) or cost per acquisition advertising is performance based and is common in the affiliate marketing sector of the business.
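A common way to compare ads sold under these three pricing models is to convert each to an expected revenue per thousand impressions (often called eCPM). The sketch below assumes that `ctr` is the predicted clicks per impression and `cvr` the predicted actions per click; the function name and parameters are illustrative, not taken from the talk.

```python
def ecpm(pricing, price, ctr=0.0, cvr=0.0):
    """Expected revenue per 1000 impressions for a given pricing model.

    pricing: "CPM", "CPC" or "CPA"
    price:   amount paid per 1000 impressions, per click, or per action
    ctr:     assumed probability of a click given an impression
    cvr:     assumed probability of an action given a click
    """
    if pricing == "CPM":
        return price
    if pricing == "CPC":
        return price * ctr * 1000
    if pricing == "CPA":
        return price * ctr * cvr * 1000
    raise ValueError(f"unknown pricing model: {pricing}")

# Hypothetical numbers: a $2 CPM ad, a $0.50 CPC ad and a $20 CPA ad.
print(ecpm("CPM", 2.0))
print(ecpm("CPC", 0.5, ctr=0.01))
print(ecpm("CPA", 20.0, ctr=0.01, cvr=0.05))
```

Putting all ads on the same scale is what lets them compete for the same display opportunity, at the cost of depending on how good the click and action predictions are.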
28. Bidding and Yield Optimization
Real-time Bidding (RTB) facilitates the connection of Supply and Demand from different private marketplaces
29. Summary
Channel | Market | Formats | Pricing | Devices | Targeting | UX
Search | NGD | Text | CPC | All | Keyword, geo-location | Browser
Display | GD, NGD | All | All | All | All | Browser, In-App
Social | NGD | Text, Image | CPC | All | Behavioral, geo-location, contextual, retargeting | Browser, In-App
Email | GD, NGD | Text, Image | CPM, CPL | All | Geo-location, behavioral, retargeting | Email App
30. Example Ad System: Video Exchange
3rd party data is used to identify a user and match it to advertiser demand via impression-level bidding
• User visits pubs in an exchange auction marketplace
• User clicks on video player to play
• Video Exchange simultaneously pings all twelve 3rd party data partners to see whether they have relevant demographic and/or behavioral information matching the target to available impressions across the exchange
• Exchange matches advertiser demand to qualified users (Match)
• The ad server serves a relevant pre-roll to that user in real time.
31. Responding to a Pub Ad Call
• Publisher: "Yo! I need an ad!"
• Exchange/Network: "No prob Home Slice! Here's an XML doc with all the info to execute the ad"
1. Publisher ad call
2. Exchange/Network responds with XML doc
The XML file is the recipe to execute the video ad!
32. Pub follows XML file recipe to execute ad
• Publisher: "I now have my XML doc recipe … Now I'll follow the recipe to show the ad"
[Diagram: Publisher page, 3rd Party Video Ad Server, End User]
1. Publisher follows the XML doc recipe
2. Request for video ad file
3. Video ad file sent to the Publisher's video player
4. Pre-roll ad plays & beacon events provide metrics
33. More Players, More Redirections
Advertisers use their "primary" ad server to manage the campaign and then hand off the ad calls to a "secondary" rich media ad server, finally pulling the ad from a content delivery network as in the diagram above. This type of daisy-chaining is also quite common with ad exchanges that handle remnant inventory, thus creating even more redirections.
34. OK, So How Do We Scale?
• What is the Right Architecture?
• What are the best Data Structures?
• What family of Algorithms?
35. Scalable FE Serving Architecture
[Diagram labels: Publisher; impression; Impression-Processing Server; Bid-Generation Servers; Index, Model Partitions; Data; bids, auction info]
36. Bid-Generation Server Farm
[Diagram: grid of Bid-Generation Servers; #columns = #partitions = M, #rows = #replicas = N]
37. Bidding System Structure
• Impression-Processing Server annotates the submitted impression, scatters the impression to a set of Bid-Generation Servers, gathers top bids from local auctions, and computes the overall top bids for the impression by running a global auction
• Each Bid-Generation Server works on a partition of demand data, generates bids for a given impression based on that data partition, conducts a local auction across those bids, and returns local winners and the corresponding auction info
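The scatter/gather structure described above can be sketched in a few lines. This is a single-process illustration under stated assumptions: threads stand in for remote Bid-Generation Servers, the eligibility test is a simple geo check, and all names (`local_auction`, `global_auction`, the ad fields) are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def local_auction(partition, impression, top_n):
    """One Bid-Generation Server: bid on its own partition, keep local winners."""
    bids = [(ad["bid"], ad["id"]) for ad in partition
            if ad["geo"] == impression["geo"]]
    return heapq.nlargest(top_n, bids)

def global_auction(partitions, impression, top_n=2):
    """Impression-Processing Server: annotate, scatter, gather, re-rank."""
    impression = dict(impression, annotated=True)  # stand-in for annotation
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        local_results = pool.map(
            lambda part: local_auction(part, impression, top_n), partitions)
        gathered = [bid for result in local_results for bid in result]
    return heapq.nlargest(top_n, gathered)

# Hypothetical demand data split into two partitions.
partitions = [
    [{"id": "a", "bid": 1.0, "geo": "US"}, {"id": "b", "bid": 3.0, "geo": "US"}],
    [{"id": "c", "bid": 2.5, "geo": "US"}, {"id": "d", "bid": 9.0, "geo": "FR"}],
]
winners = global_auction(partitions, {"geo": "US"}, top_n=2)
print(winners)
```

Because each partition returns its own top N, the global top N is guaranteed to be among the gathered bids, so the global auction only has to rank M x N candidates rather than all demand.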
39. Analyzing an Ad Request Flow (Ad Exchange)
1. Eligibility
2. Ranking (Auction)
3. Delivery
4. Display
40. Eligibility: The Ad Matching Problem
• BE: age ∈ {10,20} & country ∉ {US}
• S: age=20 & country=FR & gender=F
• Given an assignment S, find all matching Boolean expressions (BEs)
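As a baseline, the matching problem can be solved by evaluating every Boolean expression against the assignment. The sketch below does exactly that; the tuple encoding of predicates and the names `satisfies` and `find_matching` are assumptions for illustration. A missing attribute is taken to satisfy a ∉ predicate and to fail a ∈ predicate.

```python
def satisfies(expression, assignment):
    """True if the assignment satisfies every predicate of a conjunction.

    expression: list of (attribute, op, values), op being "in" (∈) or "not_in" (∉)
    assignment: dict mapping attribute -> single value
    """
    for attr, op, values in expression:
        value = assignment.get(attr)
        if op == "in" and value not in values:
            return False
        if op == "not_in" and value in values:
            return False
    return True

def find_matching(expressions, assignment):
    """Linear scan: O(number of expressions) work per assignment."""
    return [name for name, expr in expressions.items()
            if satisfies(expr, assignment)]

expressions = {
    "BE": [("age", "in", {10, 20}), ("country", "not_in", {"US"})],
}
S = {"age": 20, "country": "FR", "gender": "F"}
print(find_matching(expressions, S))
```

The linear scan is the cost that indexing aims to avoid when there are very many expressions and a tight latency budget.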
41. Background: Inverted Indexes
• Posting lists of occurring terms (tokens) with list of documents:positions
• Used to match queries
– Tokens
– Boolean operators
• Search returns documents with relevance score
42. Indexing Boolean Expressions
• E1: A ∈ {1}
• E2: A ∈ {1} & B ∈ {2} & C ∈ {3,4}
• S: A=1 & B=2
Key | Posting List
(A,1) | E1, E2
(B,2) | E2
(C,3) | E2
(C,4) | E2
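The posting lists for E1 and E2 can be built and queried as follows. This is a minimal sketch that handles only ∈ predicates and uses a simple hit count per expression; the function names and the dict encoding are illustrative assumptions.

```python
from collections import Counter, defaultdict

def build_index(expressions):
    """Map each (attribute, value) key to the expressions that mention it."""
    index = defaultdict(list)
    sizes = {}
    for name, predicates in expressions.items():
        sizes[name] = len(predicates)          # number of ∈ predicates
        for attr, values in predicates.items():
            for value in values:
                index[(attr, value)].append(name)
    return index, sizes

def lookup(index, sizes, assignment):
    """An expression matches when every one of its predicates is hit."""
    hits = Counter()
    for key in assignment.items():
        for name in index.get(key, []):
            hits[name] += 1
    return sorted(n for n, count in hits.items() if count == sizes[n])

expressions = {
    "E1": {"A": {1}},
    "E2": {"A": {1}, "B": {2}, "C": {3, 4}},
}
index, sizes = build_index(expressions)
print(index[("A", 1)])
print(lookup(index, sizes, {"A": 1, "B": 2}))
```

For S: A=1 & B=2, E2 is reached through two posting lists but has three predicates, so only E1 is returned; the C predicate of E2 is never satisfied.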
43. K-Inverted List Construction
Figure 1: A set of conjunctions
ID | Expression | K
1 | age ∈ {3} ∧ state ∈ {NY} | 2
2 | age ∈ {3} ∧ gender ∈ {F} | 2
3 | age ∈ {3} ∧ gender ∈ {M} ∧ state ∉ {CA} | 2
4 | state ∈ {CA} ∧ gender ∈ {M} | 2
5 | age ∈ {3, 4} | 1
6 | state ∉ {CA, NY} | 0
Figure 2: Inverted list for Figure 1
K | Key and UB | Posting List
0 | (state,CA), 2.0 | (6, ∉, 0)
0 | (state,NY), 5 | (6, ∉, 0)
0 | Z, 0 | (6, ∈, 0)
1 | (age,3), 1.0 | (5, ∈, 0.1)
1 | (age,4), 3.0 | (5, ∈, 0.5)
2 | (state,NY), 5 | (1, ∈, 4.0)
2 | (age,3), 1.0 | (1, ∈, 0.1) (2, ∈, 0.1) (3, ∈, 0.2)
2 | (gender,F), 2 | (2, ∈, 0.3)
2 | (state,CA), 2.0 | (3, ∉, 0) (4, ∈, 1.5)
2 | (gender,M), 1.0 | (3, ∈, 0.5) (4, ∈, 0.9)
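The K-partitioned index for the six conjunctions of Figure 1 can be sketched as below. This is a simplified illustration, not the published algorithm: it counts hits per conjunction instead of merging sorted posting lists, it leaves out the weights and upper bounds (UB), and it assumes each attribute carries a single value in the assignment. The names `build_k_index` and `match` are hypothetical.

```python
from collections import defaultdict

# (attribute, operator, values); "in" stands for ∈ and "not_in" for ∉.
CONJUNCTIONS = {
    1: [("age", "in", {3}), ("state", "in", {"NY"})],
    2: [("age", "in", {3}), ("gender", "in", {"F"})],
    3: [("age", "in", {3}), ("gender", "in", {"M"}), ("state", "not_in", {"CA"})],
    4: [("state", "in", {"CA"}), ("gender", "in", {"M"})],
    5: [("age", "in", {3, 4})],
    6: [("state", "not_in", {"CA", "NY"})],
}

def build_k_index(conjunctions):
    """Partition posting lists by K, the number of ∈ predicates."""
    index = defaultdict(lambda: defaultdict(list))
    for cid, predicates in conjunctions.items():
        k = sum(1 for _, op, _ in predicates if op == "in")
        for attr, op, values in predicates:
            for value in values:
                index[k][(attr, value)].append((cid, op))
        if k == 0:
            # The special key Z makes conjunctions with no ∈ predicate reachable.
            index[0]["Z"].append((cid, "in"))
    return index

def match(index, assignment):
    """Return the IDs of all conjunctions satisfied by the assignment."""
    matched = set()
    max_k = min(len(assignment), max(index, default=0))
    for k in range(max_k, -1, -1):
        if k not in index:
            continue
        keys = list(assignment.items()) + (["Z"] if k == 0 else [])
        hits, rejected = defaultdict(int), set()
        for key in keys:
            for cid, op in index[k].get(key, []):
                if op == "in":
                    hits[cid] += 1
                else:
                    rejected.add(cid)   # a ∉ predicate is violated
        needed = max(k, 1)              # K=0 conjunctions need only the Z hit
        matched |= {c for c, n in hits.items()
                    if n == needed and c not in rejected}
    return matched

index = build_k_index(CONJUNCTIONS)
print(sorted(match(index, {"age": 3, "state": "NY", "gender": "F"})))
```

Partitioning by K means an assignment with t attributes never needs to look at conjunctions that require more than t ∈ predicates, which is where much of the pruning comes from.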
44. Ranking Phase I: Top-K Selection
• Search algorithm for DNF/CNF BEs with relevance ranking
• The score of a BE E reflects the "relevance" of E to an assignment S. For example, a user interested in running might be more interested in an advertisement on shoes than an advertisement on flowers
45. Example: Scoring
• S = {age=1, state=NY, gender=F}
• Ws = (1, 2, 3)
• Score(BE1) = 0.1*1 + 2*4 = 8.1
• Score(BE2) = 0.5*1 + 0.3*3 = 1.4
K | Key and UB | Posting List
2 | (state,NY), 5 | (1, ∈, 4.0)
2 | (age,3), 1.0 | (1, ∈, 0.1) (2, ∈, 0.5)
2 | (gender,F), 2 | (2, ∈, 0.3)
ID | Expression | K
1 | age ∈ {3} ∧ state ∈ {NY} | 2
2 | age ∈ {3} ∧ gender ∈ {F} | 2
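The two scores on this slide can be reproduced by summing, over the matched keys, the product of the expression-side weight from the posting list and the assignment-side weight Ws. The sketch assumes Ws = (1, 2, 3) applies to (age, state, gender) in that order and that the assignment carries age=3 so that both conjunctions match; the slide's S lists age=1. Function and variable names are illustrative.

```python
import heapq

def score(be_weights, assignment, s_weights):
    """Sum of w_E(A, v) * w_S(A) over the keys present in the assignment."""
    total = 0.0
    for (attr, value), w_e in be_weights.items():
        if assignment.get(attr) == value:
            total += w_e * s_weights[attr]
    return total

def top_k(expressions, assignment, s_weights, k):
    """Rank expressions by score and keep the k best."""
    scored = ((score(w, assignment, s_weights), name)
              for name, w in expressions.items())
    return heapq.nlargest(k, scored)

# Weights taken from the posting lists on the slide.
expressions = {
    "BE1": {("age", 3): 0.1, ("state", "NY"): 4.0},
    "BE2": {("age", 3): 0.5, ("gender", "F"): 0.3},
}
s_weights = {"age": 1, "state": 2, "gender": 3}
assignment = {"age": 3, "state": "NY", "gender": "F"}  # age=3 is an assumption

for s, name in top_k(expressions, assignment, s_weights, k=2):
    print(name, round(s, 2))
```

In a full implementation the per-key upper bounds (UB) would allow skipping posting lists that cannot lift a candidate into the current top K; that pruning is left out here.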
47. Example: Ad Matching
• Assignment [S]: age=20 & country=FR & gender=F
• Boolean Expression [SF]: age ∈ {10,20} & country ∉ {US}
• Given an assignment S, find all matching Boolean expressions (SFs)
• Boolean Expression [DF]: ad_size ∈ {800x400,200x50} & type ∉ {flash}
• Assignment [D]: crtv_tag=sports & size=800x400 & type=Flash
• Given a Boolean Expression DF, find all matching Assignments (Ds)
Return all matching Ad Units satisfying the two-way match!!
Opportunity Query = Supply Attributes (values) ^ Demand Filters (BE)
Indexed Ad Units: Demand Attributes (values), Supply Filters (BE)
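48. Ranking Phase II: Auction
• Bids are computed as an optimization based on objectives subject to budget constraints.
$v_G = \sum_{g} \text{action-rate}_g \cdot v_g$
where the sum runs over the goals $g$, $\text{action-rate}_g$ is the action rate for goal $g$ and $v_g$ is the goal value.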
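Read as an expected value, the formula on this slide sums, over the goals, the predicted action rate times the goal value. The sketch below computes that value and caps the bid at the remaining budget; the cap is a deliberately crude stand-in for the budget-constrained optimization, and all names and numbers are hypothetical.

```python
def goal_value(goals):
    """v_G: sum over goals of action rate times goal value."""
    return sum(g["action_rate"] * g["value"] for g in goals)

def bid(goals, remaining_budget):
    """Bid the expected value, never more than what is left in the budget."""
    return max(0.0, min(goal_value(goals), remaining_budget))

# Hypothetical campaign with two goals.
goals = [
    {"name": "click", "action_rate": 0.02, "value": 1.5},
    {"name": "conversion", "action_rate": 0.001, "value": 40.0},
]
print(round(goal_value(goals), 4))
print(bid(goals, remaining_budget=0.05))
```

A production bidder would typically also pace spend over time rather than only capping at the remaining budget.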
49. Predictive Analytics and Models
• ML and CF techniques can be used to compute
– Weights for Relevance Ranking: assigned to BE clauses and assignment pairs
– Action-Rates, e.g. response prediction: what is the probability of a user completing an ad view, clicking or converting
– Optimization
   – Delivery: Availability and Pacing based on Budgets
   – Revenue/ROI based Optimization
• Exploration-Exploitation is required to "learn" new signals.
• Resulting models should be partitioned and loaded into Bidding Servers
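One simple way to trade exploration against exploitation is an epsilon-greedy policy: most of the time serve the item with the best observed action rate, and occasionally serve a random one so that new items get a chance to collect feedback. This is only one of several possible strategies and is not necessarily the one the talk has in mind; the class and its parameters are illustrative.

```python
import random

class EpsilonGreedy:
    """Serve the best-known item, exploring at random with probability epsilon."""

    def __init__(self, items, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {item: 0 for item in items}
        self.actions = {item: 0 for item in items}

    def rate(self, item):
        """Observed action rate (0.0 until the item has been shown)."""
        shown = self.shows[item]
        return self.actions[item] / shown if shown else 0.0

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))  # explore
        return max(self.shows, key=self.rate)         # exploit

    def update(self, item, acted):
        """Record one display and whether the user acted on it."""
        self.shows[item] += 1
        self.actions[item] += int(acted)

policy = EpsilonGreedy(["a", "b"], epsilon=0.1, seed=7)
policy.update("a", acted=False)
policy.update("b", acted=True)
print(policy.select())
```

The observed rates learned this way are the kind of signal that would be fed back into the action-rate models.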
50. The Business of Recommendations
• Recommendations impact your business
– Create campaigns that target certain audiences, sections of the application, geo-location, etc.
– Use recommendations as a way to do promotions as well as upsell and cross-sell
• Not all items-actions are created equal
– Assess the value of the goals. Bidding agents will take care of the rest
– Some items have a limited life-span (e.g. window of availability). Be sure to represent this as constraints or budgets
51. Summary
Advertising | Recommender Systems
Targeting | Constraints
Budget | Availability
Bid | Relevance
Auction | Selection
Model | Model
52. The All Encompassing Data Engine
[Diagram: a Data Engine underlying Search & Discovery, Recommendations, Advertising and Intelligence]
Data Engine = Data Core + Analytics
53. The Crux of Metrics and Evaluation
• Business: Revenue, User Experience, Product and Service Rating
• Systems: Conversion Rate, ROC Curves, Precision, Recall
• User: Relevance, Enjoyable, Novelty, Originality
54. Bucket Testing and Offline Evaluation
[Diagram: an Ad Server (Random Bucket) produces Traces (100%) and Event Data (impr, click, conv, prob); a Replayer sends the traced Ad Calls to the Ad Server to be evaluated; its HTTP Responses are joined with the Event Data to produce the Final Data for Evaluation]
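One common way to use traces from a random bucket offline is replay evaluation: feed each logged ad call to the server under evaluation, keep only the events where it would have served the ad that was actually shown, and weight each kept event by the inverse of the logged serving probability. The sketch below follows that idea under stated assumptions; the event fields (`context`, `ad`, `click`, `prob`) and the function names are hypothetical and only loosely modelled on the labels in the diagram.

```python
def replay_evaluate(traces, policy):
    """Estimate the click rate `policy` would get, from randomly served traces.

    traces: iterable of dicts with keys "context", "ad", "click", "prob",
            where "prob" is the probability with which "ad" was served
    policy: function mapping a context to the ad it would serve
    """
    weighted_clicks = total_weight = 0.0
    for event in traces:
        if policy(event["context"]) != event["ad"]:
            continue                  # the join drops non-matching responses
        weight = 1.0 / event["prob"]  # inverse-propensity weight
        weighted_clicks += weight * event["click"]
        total_weight += weight
    return weighted_clicks / total_weight if total_weight else None

# Hypothetical random-bucket log: two ads served with probability 0.5 each.
traces = [
    {"context": {"geo": "US"}, "ad": "a", "click": 1, "prob": 0.5},
    {"context": {"geo": "US"}, "ad": "b", "click": 0, "prob": 0.5},
    {"context": {"geo": "FR"}, "ad": "a", "click": 0, "prob": 0.5},
    {"context": {"geo": "FR"}, "ad": "b", "click": 1, "prob": 0.5},
]

always_a = lambda ctx: "a"
geo_aware = lambda ctx: "a" if ctx["geo"] == "US" else "b"
print(replay_evaluate(traces, always_a))
print(replay_evaluate(traces, geo_aware))
```

The estimate relies on the logged bucket being truly randomized; traces from a non-random server would bias it toward whatever that server already favored.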
55. The Big Fish: OTT Television
• Online TV and Video-on-Demand are here to stay
• Starting to tap into traditional TV/Cable advertising budgets
• Viewership + Web Data will power new forms of Online Advertisement
57. References
• Indexing Boolean Expressions
• Computational Advertising and Recommender Systems
• A Market-Based Approach to Recommender Systems
• ICML'11 Tutorial on Machine Learning for Large Scale Recommender Systems