Personalisation promises to boost profits across many industries, but in reality the difficulties and costs may often outweigh the benefits. How do you choose an appropriate algorithm and modelling approach when reality doesn't fit theoretical models? How do you ensure cost-efficiency and future-proofing, which academic research doesn't cover? How do you respond to changing business requirements on short notice?
Go beyond research papers - draw from our experience of building a pragmatic, real-world personalisation platform for 30M users of a major European media group.
https://www.linkedin.com/in/pturek/
https://www.linkedin.com/in/zmu-michal/
4. Today's journey – next: Reasons to Believe
Reasons to Believe · Challenges of the Real World · Our Approach · Brave New World
5. Reasons to believe – Personalisation as a Service
20M real users · 10K RPS · 30K events/s
10+ diverse brands applicable · deployment time: 1 week
From 5% to 90+% at Onet.pl in 10 days
6. Reasons to believe – works like a charm!
Comparison of personalised and manual mobile versions, 10.12.2018 to 10.01.2019
Users are ATTRACTED: CTR, headline section +43%
Become ACTIVE: PageViews/UU, average +12%
Stay LOYAL: Active Users for 10+ days/month +7%
7. Today's journey – next: Challenges of the Real World
Reasons to Believe · Challenges of the Real World · Our Approach · Brave New World
23. USER SEGMENTATIONS
REAL-TIME ALGORITHMS
25. Platforms >> Products
"A product is useless without a platform,
or more precisely, a platform-less
product will always be replaced by an
equivalent platform-ised product"
--- Jeff Bezos
38.
Hey! Users interested in
"Game of Thrones, winter, anticlimax"
are less active than they used to be.
Consider writing more about this
Estimated users affected: 3.2M
Hey Ring!
What can I do
better?
[T]
Hello everyone, thanks for coming and tuning in.
Today we are going to tell the story of how to build a personalisation platform for 30M users and not go bust in the process.
However, it's about more than that. It's also about building formidable data-driven platforms and about conducting game-changing projects with limited resources.
So this story may apply to many of you.
[T]
My name is Piotr Turek and this is Michał Żmuda.
We hope that what we are about to present today is not just a statistical fluke, but the result of our previous experience,
as well as the best practices and lessons learned that we applied in this project.
[T]
Today we represent DreamLab, the IT hub of Ringier Axel Springer, itself the biggest digital publisher in CEE.
Our publishing platform, Ring Publishing, powers a multitude of leading, diverse brands in 9 countries on 2 continents.
Our products are used every day by over 30M real, active users.
As you can imagine, our challenge was to build a personalisation platform for all these brands and people.
Take a look at our journey plan for today.
First, we are going to share with you a couple of reasons to believe.
Platform – personalisation as a service.
Success can be estimated by the demand for a platform and the scale it operates at. And ours is not too shabby!
Quality, on the other hand, can be estimated by its capabilities.
The platform has been built from the start to suit a very diverse portfolio of brands. Thanks to that, we are confident offering our services to incoming clients.
And when someone wants to integrate with us, we can deliver – within just a week we can start bringing value to any new client.
We've already deployed to Onet – the largest media site in CEE. We went from an initial evaluation to covering almost the whole homepage surface area in less than two weeks.
And why has Onet adopted us so eagerly? Because we make them a great deal of money – take a look at the numbers. Gains in those metrics translate into gains in revenue.
Revenue is important, but the partnership shouldn't be neglected either.
We've taken part in Onet's transition – we've helped them go from being ad-centric to being user-engagement oriented.
//We've done so by providing a suitable KPI for optimisation – one of many that our platform has to offer. Such a partnership in reaching goals builds trust and fosters adoption.
[T]
Now, we’ve shown that we’ve got results.
Let's move on to what we've learned along the way.
We are about to discuss real-world problems that could have doomed us (but they didn't :p).
[T]
But first, let's briefly discuss the intuition behind the most popular and most widely studied approach to personalisation – 1-to-1 personalisation.
The basic intuition is simple: whenever we need to generate recommendations for a particular user, we perform a nearest-neighbour search to find the users most similar to them, based on the content they liked.
It's as if the yellow guy and the pink girl actually approached their friends and asked them to recommend a few articles.
This idea is a basic building block of many advanced methods and is the obvious first choice when building any kind of personalisation. However, it faces many obstacles in the world of digital publishing.
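To make that intuition concrete, here is a minimal sketch of the nearest-neighbour idea on a toy user-item matrix. All data is made up, and cosine similarity stands in for whatever similarity measure a real system would use:

```python
# Minimal sketch of the 1-to-1 intuition: recommend what your nearest
# neighbours liked. Toy data, for illustration only.
import numpy as np

# Rows = users, columns = articles; 1 means "user consumed/liked it".
interactions = np.array([
    [1, 1, 0, 0, 1],   # our target user
    [1, 1, 1, 0, 0],   # a very similar reader
    [0, 0, 1, 1, 0],   # a dissimilar one
    [1, 1, 0, 1, 1],
])

def recommend(user_idx, k=2, n_items=2):
    u = interactions[user_idx]
    norms = np.linalg.norm(interactions, axis=1) * np.linalg.norm(u)
    sims = interactions @ u / np.maximum(norms, 1e-9)   # cosine similarity
    sims[user_idx] = -1.0                  # don't match the user with themselves
    neighbours = np.argsort(sims)[-k:]     # k most similar users
    scores = interactions[neighbours].sum(axis=0).astype(float)
    scores[u > 0] = -1.0                   # drop already-seen articles
    return np.argsort(scores)[-n_items:][::-1]

print(recommend(0))  # articles the neighbours liked that user 0 hasn't seen
```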
[T]
The goal of a digital publisher is not only to provide people with content they like but, even more so, to keep them informed about events around the world. We've got over 7K articles published every day. We have to deliver them to our users within minutes – every second counts.
The traditional approach to personalisation struggles with this, because it relies on a small group of the most similar users having already had the opportunity to consume a given article. A chicken-and-egg problem, really.
One could try using content-based approaches to find articles similar to the newly added one and, based on that, try to guess who may like the new one.
We can also try mixing in global, popularity-based approaches. However, both ideas are workarounds that significantly increase the COMPLEXITY of our solution.
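For illustration, a hedged sketch of that content-based workaround, assuming article texts are available. The texts are toy examples and scikit-learn's TF-IDF is used as a stand-in for a real text model:

```python
# Sketch of the content-based workaround: when a brand-new article has no
# interaction data yet, fall back to textual similarity with known articles.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

known_articles = [
    "election results and government coalition talks",
    "champions league final match report",
    "new smartphone review battery camera",
]
new_article = "parliament vote coalition government crisis"

vectorizer = TfidfVectorizer()
known_vecs = vectorizer.fit_transform(known_articles)
new_vec = vectorizer.transform([new_article])

# Users who liked the most similar known article become candidate readers
# for the new one -- a guess, not observed behaviour.
sims = cosine_similarity(new_vec, known_vecs)[0]
print("most similar known article:", sims.argmax(), sims)
```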
[T]
Do you know what this is? The guy on the left is a cold user suffering from the cold-start problem ;)
You may think that you know a lot about your users. On aggregate that may be true, but I bet that you know much less about particular users than you think (this holds even in subscription-based products such as Netflix).
Since classical approaches to personalisation rely on finding similar users, they struggle to provide sensible recommendations for users we know little about.
One could try fixing this by doing online model updates to incorporate the preferences of "new" users faster. We can also mix in global, popularity-based approaches. But again, these sound like COMPLEX and not necessarily the most performant workarounds!
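One way such a popularity blend could look – the weighting scheme below is purely our illustration, not a prescription from the literature:

```python
# Sketch of the popularity-blend workaround: the less we know about a user,
# the more weight global popularity gets.
import numpy as np

def blended_scores(personal_scores, popularity, n_events_for_user, pivot=20):
    # Confidence grows with the amount of feedback we have for this user;
    # `pivot` (a hypothetical knob) is where we trust both sides equally.
    w = n_events_for_user / (n_events_for_user + pivot)
    return w * personal_scores + (1 - w) * popularity

personal = np.array([0.9, 0.1, 0.3])
popular = np.array([0.2, 0.8, 0.5])
print(blended_scores(personal, popular, n_events_for_user=2))    # mostly popularity
print(blended_scores(personal, popular, n_events_for_user=200))  # mostly personal
```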
[T]
And there is more... I bet most of you have heard about filter bubbles. It's that dangerous phenomenon which occurs so often on today's Internet.
Somehow you find yourself trapped in a tiny, cosy bubble of information – only ever getting things the system is confident you will like.
Have you ever wondered how filter bubbles happen?
Well, one possible answer is that the basic idea of 1-to-1 personalisation is by design prone to this problem, because all it does is exploit what it already knows about you. THERE IS NO BUILT-IN EXPLORATION!
You can look for various workarounds, such as adding random items for fairness, or even try to build proper exploration into the system, but AGAIN, one word: COMPLEXITY.
[T]
Finally, perhaps the biggest elephant in the room. You know what this is? It's a big sparse matrix.
Incidentally, it's the basic underlying data structure behind many 1-to-1 personalisation approaches.
It's big because there are millions of users and tens of thousands of articles. It's sparse because each user consumes just a tiny fraction of all content.
I guess you can intuitively feel that in such a setting, delivering recommendations with very low latency and high throughput can be very challenging and COMPLEX. Even more so if we want to update the state of the system online... and, as mentioned, in digital publishing we HAVE TO do that.
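A quick back-of-the-envelope calculation shows why. The numbers below are illustrative, not our production figures:

```python
# Back-of-the-envelope look at the "big sparse matrix" behind 1-to-1
# personalisation.
n_users, n_items = 30_000_000, 50_000
avg_reads_per_user = 30                 # each user touches a tiny fraction

dense_bytes = n_users * n_items * 4     # dense float32 matrix
nnz = n_users * avg_reads_per_user      # non-zero entries
csr_bytes = nnz * (4 + 4) + (n_users + 1) * 8   # CSR: values + col idx + row ptr

print(f"dense:   {dense_bytes / 1e12:.1f} TB")      # ~6 TB -- hopeless to serve
print(f"sparse:  {csr_bytes / 1e9:.1f} GB")         # ~7 GB -- fits in memory
print(f"density: {nnz / (n_users * n_items):.3%}")  # ~0.060%
```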
[T]
All this complexity and dimensionality can easily lead you to the conclusion that
THE COST OF PERSONALISATION IS TOO DAMN HIGH
BUT IS IT REALLY? ;)
As you can see, we are clearly in need of something else…
Sometimes some assumptions or constraints must go. This is such a case – forget about 1-to-1 personalisation. Bob & Sally don't really care whether their recommendation is UNIQUE. They care whether it's interesting to them.
Instead, DECOMPOSE the whole problem into two (that's what engineers often do: divide & conquer).
For us it is:
Segmenting users
Building algorithms which can successfully optimise for each of the segments in real time
[Z]
We've seen the problems with making Collaborative Filtering work in the real world.
Often, working around its limitations means we need to integrate Collaborative Filtering with something that works in real time by design.
We've identified that this means trouble with complexity & costs.
Instead, we need a clean solution that works in real time by design and addresses our challenges from the start.
[Z]
Here come bandit algorithms.
They are AGENTS which automatically balance exploitation and exploration.
The simplest of bandit algorithms can be implemented in 30 lines of code – see the sketch below.
It's all about getting the right data to the right place at the right time.
The model itself can be REALLY simple and yet generate fantastic results -> "THE UNREASONABLE EFFECTIVENESS OF DATA".
The bandit algorithm (if applied correctly) may address all of our challenges.
The secret again lies in using the data the right way.
And bandits lend themselves to tuning that makes them more effective. Even simple A/B tests may boost your gains. Hyperparameter optimisation (which we are currently investigating) may take performance to the sky.
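To back up the "30 lines" claim, here is a minimal epsilon-greedy bandit – one of the simplest possible variants, shown for illustration rather than as our production algorithm:

```python
# A minimal epsilon-greedy bandit. Each arm is a candidate article;
# reward is e.g. a click (1) or no click (0).
import random

class EpsilonGreedyBandit:
    def __init__(self, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # times each arm was shown
        self.values = [0.0] * n_arms  # running mean reward (e.g. CTR) per arm

    def choose(self):
        if random.random() < self.epsilon:             # explore
            return random.randrange(len(self.counts))
        return max(range(len(self.counts)),            # exploit
                   key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n  # incremental mean

# Toy usage: arm 2 has the best hidden CTR and should win over time.
hidden_ctr = [0.02, 0.05, 0.11]
bandit = EpsilonGreedyBandit(n_arms=3)
for _ in range(10_000):
    arm = bandit.choose()
    bandit.update(arm, 1 if random.random() < hidden_ctr[arm] else 0)
print(bandit.values)  # estimates should approach hidden_ctr
```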
[Z]
How do we bring the data to the right place? Here is our overall architecture.
We use continuous feedback from browsers to build event streams that are used to calculate performance measures.
But we don't stop at those simple measures – the measures are used to compute business KPIs using our OLAP cubes.
With business-defined KPIs updated at sub-second latency, we deliver new recommendations,
which in turn result in further feedback events.
The data flows with minimal latency and, thanks to the use of proper technologies, it flows in a cost-effective way.
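A highly simplified sketch of that feedback loop – the real system uses stream processing and OLAP cubes, so the in-memory aggregator and event fields below are assumptions for illustration only:

```python
# Events stream in, per-recommendation stats are updated incrementally,
# and the freshest KPI values are immediately available to the serving side.
from collections import defaultdict

class KpiAggregator:
    def __init__(self):
        self.impressions = defaultdict(int)
        self.clicks = defaultdict(int)

    def on_event(self, event):
        key = (event["segment"], event["article"])
        if event["type"] == "impression":
            self.impressions[key] += 1
        elif event["type"] == "click":
            self.clicks[key] += 1

    def ctr(self, segment, article):
        shown = self.impressions[(segment, article)]
        return self.clicks[(segment, article)] / shown if shown else 0.0

agg = KpiAggregator()
for e in [{"type": "impression", "segment": "sports", "article": "a1"},
          {"type": "click", "segment": "sports", "article": "a1"}]:
    agg.on_event(e)
print(agg.ctr("sports", "a1"))  # feeds straight back into the bandits
```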
[Z]
So we've got the right tool for the job – bandit algorithms with an adequate data flow.
Now let us discuss where segmentation fits into this picture.
As we've mentioned…
The results presented at the beginning of the presentation prove that the recommendations are attractive.
Segmentation (if applied correctly) reduces costs through caching.
It may be used to address the filter bubble, as coarse segments introduce greater variety into recommendations.
It may also address cold start: we may know very little about particular cold users, but we know SOMETHING about them as a whole. They are users who visit infrequently or who do not accept cookies, and that is something that makes them a segment.
This is something that 1-to-1 personalisation didn't provide – you can optimise for cold users.
//If one wants to optimise further, it is possible to segment cold users further by geolocation, browser info, and so on.
However, the most powerful benefit that comes with segments
is that costs are now tunable (as they scale with the number of segments used).
Thanks to that, you can deliver to any client no matter the budget,
as not everyone has enough traffic to benefit from 1-to-1 personalisation or even fine-grained segments.
[Z]
You've seen why we need segmentation – now let us see how to build an architecture for providing segments. Here is one sample architecture (we have many).
This architecture is used to provide a dynamic, unsupervised, interest-based segmentation
(which involves topic modelling and clustering users around the discovered topics).
That unsupervised method boosts responsiveness to trends even more than a bandit alone!
It is an unsupervised process – it allows segments to appear spontaneously.
This has proven effective – for instance, we've recently observed a Game of Thrones segment, which helped us harness that trend.
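As a sketch of how such a segmentation could be assembled – toy data, with LDA and k-means used here as stand-ins for whatever topic model and clustering a production system would pick:

```python
# Discover topics in article texts with LDA, describe each user by the
# topics they read, then cluster those profiles into segments.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans

articles = [
    "dragons winter battle finale episode",
    "throne king queen episode season",
    "match goal league striker transfer",
    "coach league season win derby",
]
vec = CountVectorizer()
lda = LatentDirichletAllocation(n_components=2, random_state=0)
article_topics = lda.fit_transform(vec.fit_transform(articles))

# Each user's profile = mean topic vector of the articles they read.
reads = {"u1": [0, 1], "u2": [0], "u3": [2, 3], "u4": [3]}
profiles = np.array([article_topics[ids].mean(axis=0) for ids in reads.values()])

segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(profiles)
print(dict(zip(reads, segments)))  # e.g. {'u1': 0, 'u2': 0, 'u3': 1, 'u4': 1}
```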
[Z]
We've got both the algorithm & the segmentation – so what have we ended up with?
[Z]
We've ended up with an army! An army of multi-armed bandits, contextualised with user segments ;)
And that army fights our clients' fight – a war for user engagement,
bringing us sizable spoils from that war.
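In code terms, the "army" can be as simple as one independent bandit per segment. This sketch assumes the EpsilonGreedyBandit class from the earlier example is in scope; segment names and article counts are made up:

```python
# One independent learner per segment -- the "army".
# Assumes EpsilonGreedyBandit from the earlier sketch is defined.
class SegmentedRecommender:
    def __init__(self, segments, n_articles):
        self.bandits = {s: EpsilonGreedyBandit(n_articles) for s in segments}

    def recommend(self, user_segment):
        return self.bandits[user_segment].choose()   # pick an article index

    def feedback(self, user_segment, article, clicked):
        self.bandits[user_segment].update(article, 1 if clicked else 0)

army = SegmentedRecommender(segments=["sports", "politics"], n_articles=100)
article = army.recommend("sports")
army.feedback("sports", article, clicked=True)
```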
[T]
OK, we've presented the key elements of successful media personalisation… however, is it a product or a platform? To quote Jeff:
[T] What he really meant by this is:
Products represent the "vendor knows best" mindset
Platforms are built upon a completely different philosophy of "customer knows best"
Seeing how many diverse brands we need to deploy to, we knew we had to build a platform! And so we did.
[T] In the case of personalisation, a platform means, among other things, that after completing one simple integration with the target brand – which automatically handles things like recommendation delivery, optimal caching, or collecting user feedback – we can then use:
Any segmentation (not just the interest-based one)
Any algorithm (not just one particular type of multi-armed bandit): in fact, we already support multiple algorithms
And FINALLY, pretty much any KPI / goal function can be optimised – we already support multiple as well
The point here is really that, in our opinion, you should always look for components that can be platformised and design your system with that in mind. Your future self will thank you. It certainly was true in our case, as you will see today.
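One way to picture those plugin seams in code – a deliberately minimal sketch, not our actual API:

```python
# Narrow interfaces so any segmentation, algorithm, or KPI can be plugged in
# after the one-off brand integration.
from abc import ABC, abstractmethod

class Segmentation(ABC):
    @abstractmethod
    def segment(self, user_id: str) -> str:
        """Map a user to a segment name."""

class Algorithm(ABC):
    @abstractmethod
    def recommend(self, segment: str, candidates: list[str]) -> str:
        """Pick one article for the given segment."""

class Kpi(ABC):
    @abstractmethod
    def reward(self, event: dict) -> float:
        """Turn a feedback event into the reward being optimised."""

def serve(user_id: str, candidates: list[str],
          seg: Segmentation, alg: Algorithm) -> str:
    # The serving pipeline never cares which concrete plugins are wired in.
    return alg.recommend(seg.segment(user_id), candidates)
```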
[T] Once you decide to platformize, there are two fundamental building blocks that will help you build a data-driven platform, not just a product:
Domain-Driven Design-inspired thinking
Modern Data Architecture
[Z]So, you've built such a platform, deployed it successfully for 30M people, made the business part of this revolution, and achieved great results... are we finished then? Can we keep deploying it to more and more brands until we achieve world domination?
[Z]
Such a change just shifts the problems elsewhere. Let us review an example of such a shift.
[Z]To illustrate that with a concrete example, meet Alice, who works in one of the newsrooms as an editor. Every day she faces questions such as: (…)
In the pre-personalisation era we provided Alice and her colleagues with a multitude of analytic tools and reports to help them make those decisions.
[Z]However, consider what happens once we add personalisation to the mix.
We could expand the existing tools to allow digging through all these additional dimensions of data; however…
[Z]Our brains were simply not designed to process so much data so quickly. Especially if you consider that people in the newsroom are non-technical. They are creative types. If anything, we would kill their creativity this way.
[T]Instead, we need a paradigm shift:
Some years ago, Gartner described a hierarchy of analytics…
[T]
Obviously the goal is to go from this
[T]
To this
[T]
So when Alice asks what she can do better, the publishing platform may respond by saying that: (…)
[T]
You may find this much less futuristic than it seems at first glance if you've got 3 pillars in place that will give you a head start:
An OLAP database as your source of statistics, so you can query it in previously unforeseen ways, without costly development or increased operational costs
A streaming-first processing architecture, which allows you to reuse different streams in (again) originally unforeseen ways
But most of all, an extensible plugin architecture, created thanks to the platform approach described today
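To show how unfuturistic this can be once the pillars are in place, here is a toy version of the slide-38 suggestion. The threshold and data shapes are assumptions for illustration:

```python
# Compare a segment's recent activity (as it would come from the OLAP side)
# against its baseline and suggest an editorial action.
def suggest_topics(segment_activity, drop_threshold=0.15):
    """segment_activity: {segment_name: (baseline_events, recent_events)}"""
    suggestions = []
    for segment, (baseline, recent) in segment_activity.items():
        drop = (baseline - recent) / baseline if baseline else 0.0
        if drop > drop_threshold:
            suggestions.append(
                f"Users interested in '{segment}' are {drop:.0%} less active "
                f"than they used to be. Consider writing more about this."
            )
    return suggestions

activity = {"Game of Thrones, winter, anticlimax": (4_000_000, 3_200_000)}
print(suggest_topics(activity))  # mirrors the alert shown on slide 38
```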
[T]
If all three pillars are in place, you will find it surprisingly easy to add such advanced functionalities to your data-driven PLATFORM
[Z]So, we've built a platform, we've deployed it,
& we are executing a vision for how to address the new challenges.
This time it is a complete success!
[Z]
We would like to leave you with 3 key takeaways:
Machine Learning projects shouldn't start with throwing a bunch of data scientists at the problem, making a fuss about it, and seeing what happens. They are business transformations.
And they should have a clear path to production, which means they need engineering, a platform approach, and simplicity.
The really complex machine learning (shooting for the stars) should come into play only once you have the trust of stakeholders & a proven baseline in production.
[Z]
Thanks for your attention, it was a pleasure. We would love to hear your thoughts - find us later.