Artworks personalization on Netflix

7. Intuition for Personalized Assets ● Emphasize themes through different artwork according to some context (user, viewing history, country, etc.) ● Example: preferences in genre

8. Intuition for Personalized Assets ● Emphasize themes through different artwork according to some context (user, viewing history, country, etc.) ● Example: preferences in cast members

9. Bandit Algorithms Setting For each (user, show) request: ● Actions: set of candidate images available ● Reward: how many minutes the user played from that impression ● Environment: Netflix homepage on the user’s device ● Learner: its goal is to maximize the cumulative reward after N requests (Diagram labels: Learner, Environment, Action, Reward, Context)
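The learner/environment loop on this slide can be sketched as follows. This is a minimal illustration, not the production setup: `run_bandit`, `toy_play_minutes`, the two learners, and the `preferred_image` field are all hypothetical names, and the reward values are made up.

```python
import random

def run_bandit(choose_image, play_minutes, requests, n_images):
    """Run the learner/environment loop; return (cumulative reward, log)."""
    cumulative = 0.0
    log = []  # (context, action, reward) triples the learner may consult
    for context in requests:
        image = choose_image(context, n_images, log)  # action chosen by the learner
        minutes = play_minutes(context, image)        # reward returned by the environment
        log.append((context, image, minutes))
        cumulative += minutes
    return cumulative, log

# Hypothetical toy environment: a user plays longer when shown a preferred image.
def toy_play_minutes(context, image):
    return 30.0 if image == context["preferred_image"] else 5.0

_rng = random.Random(0)

def random_learner(context, n_images, log):
    # Baseline: ignore context and pick uniformly at random.
    return _rng.randrange(n_images)

def oracle_learner(context, n_images, log):
    # Upper bound for this toy: always pick the preferred image.
    return context["preferred_image"]

requests = [{"user": u, "preferred_image": u % 3} for u in range(6)]
random_total, random_log = run_bandit(random_learner, toy_play_minutes, requests, 3)
oracle_total, _ = run_bandit(oracle_learner, toy_play_minutes, requests, 3)
print(random_total, oracle_total)
```

The gap between `oracle_total` and the learner's total is the cumulative regret that a bandit strategy tries to keep small.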

10. Numerous Variants ● Different Strategies: ε-Greedy, Thompson Sampling (TS), Upper Confidence Bound (UCB), etc. ● Different Environments: ○ Stochastic and stationary: Reward is generated i.i.d. from a distribution specific to the action. No payoff drift. ○ Adversarial: No assumptions on how rewards are generated. ● Different objectives: Cumulative regret, tracking the best expert ● Continuous or discrete set of actions, finite vs infinite ● Extensions: Varying set of arms, Contextual Bandits, etc.
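As one concrete instance of the strategies listed, here is a minimal Thompson Sampling sketch for the stochastic, stationary case. It assumes a binary reward (play / no play) with a Beta(1, 1) prior per image, which is a simplification of the minutes-played reward described earlier; the class name `BetaThompson` and the play probabilities are hypothetical.

```python
import random

class BetaThompson:
    """Thompson Sampling with a Beta posterior per arm (binary rewards assumed)."""

    def __init__(self, n_arms, seed=0):
        self.alpha = [1.0] * n_arms  # 1 + number of observed plays
        self.beta = [1.0] * n_arms   # 1 + number of observed non-plays
        self.rng = random.Random(seed)

    def select(self):
        # Sample a plausible play rate for each arm and act greedily on the samples.
        samples = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, played):
        if played:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0

# Hypothetical stationary environment: a fixed play probability per image.
true_play_prob = [0.1, 0.5, 0.3]
env_rng = random.Random(1)
learner = BetaThompson(n_arms=3, seed=0)
pulls = [0, 0, 0]
for _ in range(2000):
    arm = learner.select()
    played = env_rng.random() < true_play_prob[arm]
    learner.update(arm, played)
    pulls[arm] += 1
print(pulls)
```

Because rewards here are i.i.d. per arm, the posterior concentrates and most impressions end up on the image with the highest play probability.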

11. Specific challenges ● Play attribution and reward assignment ○ Incremental effect of the image on top of the recommender system ● Only one image per title can be presented ○ Although inherently it is a ranking problem ● Would you play because the movie is recommended, because of the artwork, or both?

12. Specific challenges ● Change effect ○ Can changing images too often confuse users? (Figure: image Sequence A and Sequence B across Session 1, Session 2, Session 3, ..., Session N)

13. Actions ● We have control over the set of actions ○ How many images per show ○ Image design ● What makes a good asset? ○ Representative (no clickbait) ○ Differential ○ Informative ○ Engaging (Slide annotation: Personal, i.e. contextual)

14. Epsilon Greedy Example (Flowchart labels: Choose; Explore show?; branch probabilities ε_profile, 1-ε_profile, ε_show, 1-ε_show; outcomes: Personalized Image, Image At Random)
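The flowchart on this slide does not survive extraction, so the sketch below is an assumed reading of its labels: a profile is placed in exploration with probability ε_profile, and for an exploring profile each show gets a random image with probability ε_show; otherwise the personalized image is shown. The function names and this two-level structure are assumptions, not the confirmed Netflix policy.

```python
import random

def profile_explores(eps_profile, rng):
    """Decide (once per profile, by assumption) whether this profile explores."""
    return rng.random() < eps_profile

def choose_image(candidates, personalized_image, explores, eps_show, rng):
    """Return (image, source) for one show under the assumed two-level scheme."""
    if explores and rng.random() < eps_show:
        return rng.choice(candidates), "random"
    return personalized_image, "personalized"

rng = random.Random(0)
candidates = ["img_a", "img_b", "img_c"]  # hypothetical image pool for one show

# With eps_show = 1.0 an exploring profile always gets a random image.
explored = choose_image(candidates, "img_a", True, 1.0, rng)
# A non-exploring profile always gets the personalized image.
exploited = choose_image(candidates, "img_a", False, 1.0, rng)
print(explored, exploited)
```

Impressions tagged `"random"` are the explore data that an unbiased offline evaluation can later be computed from.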

15. Greedy Policy Example ● Learn a binary classifier per image to predict probability of play ● Pick the winner (arg max) (Diagram: Member (context) features and the image pool feed Model 1, Model 2, Model 3, Model 4; arg max selects the Winner)
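A minimal sketch of this greedy policy, assuming a plain logistic regression per image trained with stochastic gradient descent. The slides do not say which classifier is used; the feature layout, image names, and training logs below are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Predicted probability of play; the last weight is the bias term."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + weights[-1])

def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit one binary classifier with per-example gradient steps on log loss."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, played in zip(X, y):
            error = predict(weights, x) - played
            for i, xi in enumerate(x):
                weights[i] -= lr * error * xi
            weights[-1] -= lr * error
    return weights

# Hypothetical member features: [likes_comedy, likes_romance].
X = [[1, 0], [1, 0], [0, 1], [0, 1]]
logs = {
    "comedy_art": (X, [1, 1, 0, 0]),   # played mostly by comedy fans
    "romance_art": (X, [0, 0, 1, 1]),  # played mostly by romance fans
}
models = {image: fit_logistic(feats, plays) for image, (feats, plays) in logs.items()}

def pick_winner(models, x):
    """Greedy policy: arg max of predicted play probability over the image pool."""
    return max(models, key=lambda image: predict(models[image], x))

print(pick_winner(models, [1, 0]), pick_winner(models, [0, 1]))
```

Training each model only on impressions where its image was shown is what makes the explore data from random assignment valuable.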

16. Take Fraction Example: Luke Cage ● User A, User B, and User C each see the artwork; the outcome is Play or No play ● Take Fraction = 1 / 3
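The arithmetic behind this example is plays divided by impressions. In the short sketch below, which user played is an assumption; the slide only gives the 1 / 3 total.

```python
from fractions import Fraction

def take_fraction(played_flags):
    """Fraction of impressions that led to a play."""
    if not played_flags:
        raise ValueError("need at least one impression")
    return Fraction(sum(played_flags), len(played_flags))

# Hypothetical outcomes: exactly one of the three users plays.
impressions = {"User A": True, "User B": False, "User C": False}
luke_cage_tf = take_fraction(list(impressions.values()))
print(luke_cage_tf)  # 1/3
```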

17. Offline metric: Replay [Li et al, 2010] ● Unbiased offline evaluation from explore data ● Offline Take Fraction = 2 / 3 (Figure: User 1 to User 6, with rows Random Assignment, Play?, and Model Assignment)
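Replay keeps only the logged impressions where the model would have shown the same image that random assignment actually showed, then computes the take fraction on that matched subset. The sketch below reproduces the 2 / 3 figure; the per-user assignments and play outcomes are invented for illustration, since the slide's figure is not recoverable.

```python
from fractions import Fraction

def replay_take_fraction(logged, policy):
    """Take fraction over impressions where the policy agrees with the random log.

    logged: list of (user, randomly_assigned_image, played) tuples.
    policy: function mapping a user to the image the model would show.
    """
    matched = [played for user, image, played in logged if policy(user) == image]
    if not matched:
        return None  # no overlap between model and random assignment
    return Fraction(sum(matched), len(matched))

# Hypothetical explore data for six users.
logged = [
    (1, "A", True),
    (2, "B", False),
    (3, "C", False),
    (4, "A", True),
    (5, "B", True),
    (6, "A", False),
]
# Hypothetical model assignments; they match the random log for users 1, 3 and 5.
model_assignment = {1: "A", 2: "A", 3: "C", 4: "B", 5: "B", 6: "C"}

offline_tf = replay_take_fraction(logged, model_assignment.get)
print(offline_tf)  # 2/3
```

Because images in the explore data were assigned at random, the matched subset is an unbiased sample of what the model would have experienced online, at the cost of discarding the unmatched impressions.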

18. Offline Replay ● Context matters ● Artwork diversity matters ● Personalization wiggles around most popular images (Figure: lift in Replay for the various algorithms compared to the Random baseline)

19. Online results ● Rollout to our >130M member base ● Most beneficial for lesser-known titles ● Compression from title-level offline metrics due to cannibalization between titles

20. Research Directions

21. Action selection orchestration ● Neighboring image selection influences result ● Title-level optimization is not enough (Figure: Stand-up comedy rows; Row A: diverse images; Row B: the microphone row)

22. Automatic image selection ● Generating new artwork is costly and time-consuming ● Develop an algorithm to predict asset quality from the raw image

23. Long-term Reward: Road to RL ● Maximize long-term reward: reinforcement learning ○ User long-term joy rather than plays

24. Thank you. Fernando Amat (famat@netflix.com) Blogpost We are hiring!
