Devoxx Real-time Learning
- 2. whoami – Ted Dunning
Chief Application Architect, MapR Technologies
Committer, member, Apache Software Foundation
– particularly Mahout, Zookeeper and Drill
(we’re hiring)
Contact me at
tdunning@maprtech.com
tdunning@apache.com
ted.dunning@gmail.com
@ted_dunning
©MapR Technologies - Confidential 2
- 3. Slides and such (available late tonight):
– http://www.mapr.com/company/events/devoxx-3-29-2013
Hash tags: #mapr #devoxxfr
- 4. Agenda
What is real-time learning?
A sample problem
Philosophy, statistics and the nature of the knowledge
A solution
System design
- 5. What is Real-time Learning?
Training data arrives one record at a time
The system improves a mathematical model based on a small
amount of training data
We retain at most a fixed amount of state
Each learning step takes O(1) time and memory
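A minimal sketch of what such a learner might look like, assuming a logistic model trained by stochastic gradient descent. The class name `OnlineLogistic`, the learning rate and the two toy records are illustrative assumptions, not taken from the talk. The retained state is one weight per feature, and each step costs the same regardless of how many records have been seen.

```python
import math


class OnlineLogistic:
    """Logistic regression trained one record at a time with SGD (illustrative)."""

    def __init__(self, n_features, learning_rate=0.1):
        # The only retained state: one weight per feature.
        self.w = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def learn(self, x, y):
        # One gradient step on the log-likelihood of a single record.
        error = y - self.predict(x)
        for i, xi in enumerate(x):
            self.w[i] += self.learning_rate * error * xi


model = OnlineLogistic(n_features=2)
for _ in range(200):
    model.learn([1.0, 1.0], 1)   # hypothetical record that converted
    model.learn([1.0, -1.0], 0)  # hypothetical record that did not
```

Nothing is stored except the weights, so memory does not grow with the number of records.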
- 6. We have a product to sell … from a web-site
- 7. What picture? What tag-line? What call to action?
Bogus Dog Food is the Best! Now available in handy 1 ton bags!
Buy 5!
- 8. The Challenge
Design decisions affect probability of success
– Cheesy web-sites don’t even sell cheese
The best designers do better when allowed to fail
– Exploration juices creativity
But failing is expensive
– If only because we could have succeeded
– But also because offending or disappointing customers is bad
- 9. A Quick Diversion
You see a coin
– What is the probability of heads?
– Could it be larger or smaller than that?
I flip the coin and while it is in the air ask again
I catch the coin and ask again
I look at the coin (and you don’t) and ask again
Why does the answer change?
– And did it ever have a single value?
- 10. A Philosophical Conclusion
Probability as expressed by humans is subjective and depends on
information and experience
- 11. So now you understand Bayesian probability
- 12. Another Quick Diversion
Let’s play a shell game
This is a special shell game
It costs you nothing to play
The pea has constant probability of being under each shell
(trust me)
How do you find the best shell?
How do you find it while maximizing the number of wins?
- 14. Conclusions
Can you identify winners or losers without trying them out?
No
Can you ever completely eliminate a shell with a bad streak?
No
Should you keep trying apparent losers?
Yes, but at a decreasing rate
- 15. So now you understand multi-armed bandits
- 16. Is there an optimum strategy?
- 17. Thompson Sampling
Select each shell according to the probability that it is the best
Probability that it is the best can be computed using the posterior
$$P(i \text{ is best}) = \int I\left[ E[r_i \mid \theta] = \max_j E[r_j \mid \theta] \right] P(\theta \mid D) \, d\theta$$
But I promised a simple answer
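One way to read the integral is as an average over posterior samples: draw parameters from the posterior, note which shell is best under that draw, and count. The sketch below assumes win/lose rewards and a uniform Beta(1, 1) prior per shell, neither of which is stated on the slide. The function name `prob_best` and the counts are hypothetical.

```python
import random


def prob_best(wins, losses, n_samples=10000, rng=None):
    """Monte Carlo estimate of P(i is best) for each shell."""
    rng = rng or random.Random(0)
    counts = [0] * len(wins)
    for _ in range(n_samples):
        # One joint draw from the (assumed) Beta posteriors.
        draws = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        # The indicator in the integral: which shell is best under this draw?
        counts[draws.index(max(draws))] += 1
    return [c / n_samples for c in counts]


# Hypothetical record so far: shell 0 has won 20 of 25 tries.
p = prob_best(wins=[20, 5, 5], losses=[5, 20, 20])
```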
- 18. Thompson Sampling – Take 2
Sample θ
$$\theta \sim P(\theta \mid D)$$
Pick i to maximize reward
$$i = \arg\max_j E[r_j \mid \theta]$$
Record result from using i
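The three steps above can be sketched for the simplest case, again assuming win/lose rewards with a Beta(1, 1) prior per arm. The class `BetaBandit`, the helper `simulate` and the true win rates are illustrative assumptions, not part of the talk.

```python
import random


class BetaBandit:
    """Thompson sampling for win/lose rewards (illustrative sketch)."""

    def __init__(self, n_arms, rng=None):
        self.wins = [0] * n_arms
        self.losses = [0] * n_arms
        self.rng = rng or random.Random()

    def choose(self):
        # Sample theta from the posterior, one value per arm.
        theta = [self.rng.betavariate(w + 1, l + 1)
                 for w, l in zip(self.wins, self.losses)]
        # Pick the arm with the largest expected reward under that sample.
        return max(range(len(theta)), key=theta.__getitem__)

    def record(self, arm, won):
        # Record the result from using the chosen arm.
        if won:
            self.wins[arm] += 1
        else:
            self.losses[arm] += 1


def simulate(true_p, rounds, seed=0):
    """Play the bandit against hypothetical true win rates; return pulls per arm."""
    rng = random.Random(seed)
    bandit = BetaBandit(len(true_p), rng)
    pulls = [0] * len(true_p)
    for _ in range(rounds):
        arm = bandit.choose()
        bandit.record(arm, rng.random() < true_p[arm])
        pulls[arm] += 1
    return pulls


pulls = simulate([0.2, 0.5, 0.8], rounds=2000)
```

Exploration is not coded separately. It comes from the randomness of the posterior draw and fades as the posteriors narrow.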
- 20. Bayesian Bandit for the Shells
Compute distributions based on data so far
Sample p1, p2 and p3 from these distributions
Pick shell i where $i = \arg\max_j p_j$
Lemma 1: The probability of picking shell i will match the
probability it is the best shell
Lemma 2: This is as good as it gets
- 21. And it works!
[Figure: regret (0 to 0.12) versus n (0 to 1100), comparing ε-greedy with ε = 0.05 against the Bayesian Bandit with Gamma-Normal]
- 23. The Basic Idea
We can encode a distribution by sampling
Sampling allows unification of exploration and exploitation
Can be extended to more general response models
- 24. The Original Problem
[Figure: the mock-up ad ("Bogus Dog Food is the Best! Now available in handy 1 ton bags!", "Buy 5!") with its design elements labeled x1, x2 and x3]
- 25. Mathematical Statement
Logistic or probit regression
$$P(\text{conversion}) = w\left(\sum_i x_i \theta_i\right)$$
Logistic:
$$w(x) = \frac{1}{1 + e^{-x}}$$
Probit:
$$w(x) = \frac{\operatorname{erf}(x) + 1}{2}$$
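The two link functions are easy to write down directly. This sketch follows the formulas as given on the slide. The function names and the example design and parameter values are hypothetical.

```python
import math


def logistic(x):
    # w(x) = 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))


def erf_link(x):
    # w(x) = (erf(x) + 1) / 2, the probit-style link written on the slide
    return (math.erf(x) + 1.0) / 2.0


def p_conversion(x, theta, link=logistic):
    # P(conversion) = w(sum_i x_i * theta_i)
    return link(sum(xi * ti for xi, ti in zip(x, theta)))


# Hypothetical design vector and parameter values.
p_logit = p_conversion([1, 0, 1], [0.3, -0.2, 0.1])
p_erf = p_conversion([1, 0, 1], [0.3, -0.2, 0.1], link=erf_link)
```

Both links are increasing, so the design with the largest linear score also has the largest conversion probability under either one.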
- 26. Same Algorithm
Sample θ
$$\theta \sim P(\theta \mid D)$$
Pick design x to maximize reward
$$x^* = \arg\max_x E[r_x \mid \theta] = \arg\max_x \sum_i x_i \theta_i$$
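A rough sketch of this step for web-site designs, assuming the posterior over θ is approximated by independent Gaussians and that the candidate designs can simply be enumerated. Both are assumptions made for illustration, as are the names `sample_theta` and `pick_design` and the numbers used.

```python
import random


def sample_theta(mean, std, rng):
    # Draw theta from an assumed independent-Gaussian posterior.
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def pick_design(designs, theta):
    # Choose the design x with the largest linear score sum_i x_i * theta_i.
    return max(designs, key=lambda x: sum(xi * ti for xi, ti in zip(x, theta)))


# Hypothetical candidate designs, each a 0/1 vector (x1, x2, x3).
designs = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
rng = random.Random(0)
theta = sample_theta(mean=[0.2, -0.1, 0.4], std=[0.5, 0.5, 0.5], rng=rng)
chosen = pick_design(designs, theta)
```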
- 27. Context Variables
[Figure: the mock-up ad ("Bogus Dog Food is the Best! Now available in handy 1 ton bags!", "Buy 5!") with its design elements labeled x1, x2 and x3]
y1=user.geo y2=env.time y3=env.day_of_week y4=env.weekend
- 28. Two Kinds of Variables
The web-site design - x1, x2, x3
– We can change these
– Different values give different web-site designs
The environment or context – y1, y2, y3, y4
– We can’t change these
– They can change themselves
Our model should include interactions between x and y
- 29. Same Algorithm, More Greek Letters
Sample θ, π, φ
$$(\theta, \Pi, \Phi) \sim P(\theta, \Pi, \Phi \mid D)$$
Pick design x to maximize reward, y’s are constant
$$x^* = \arg\max_x E[r_x \mid \theta] = \arg\max_x \left( \sum_i x_i \theta_i + \sum_{i,j} x_i y_j \pi_{ij} + \sum_i y_i \phi_i \right)$$
This looks very fancy, but is actually pretty simple
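To see how simple it is, the score can be written as three sums over plain lists. In this sketch `theta`, `pi` and `phi` stand for one sample already drawn from the posterior. The function names and all numeric values are hypothetical.

```python
def score(x, y, theta, pi, phi):
    """Linear score for design x in context y."""
    design = sum(x[i] * theta[i] for i in range(len(x)))
    interaction = sum(x[i] * y[j] * pi[i][j]
                      for i in range(len(x)) for j in range(len(y)))
    context = sum(y[j] * phi[j] for j in range(len(y)))
    return design + interaction + context


def pick_design(designs, y, theta, pi, phi):
    # y is fixed for this visitor, so only x varies in the argmax.
    return max(designs, key=lambda x: score(x, y, theta, pi, phi))


# Hypothetical sampled parameters: two design variables, one context variable.
theta = [0.5, 0.0]
pi = [[-1.0], [1.0]]   # pi[i][j] couples design variable i with context variable j
phi = [0.3]
designs = [(1, 0), (0, 1)]

weekday_choice = pick_design(designs, [0], theta, pi, phi)
weekend_choice = pick_design(designs, [1], theta, pi, phi)
```

The context-only term is the same for every design, so it never changes which design wins. Only the interaction term lets the context change the choice.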
- 30. Surprises
We cannot record a non-conversion until we have waited long enough for a conversion to arrive
We cannot record a conversion until we have waited for the same time
Learning from conversions requires delay
We don’t have to wait very long
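One possible way to implement this delay is to hold every impression for a fixed time and only then emit its label, so conversions and non-conversions reach the learner after the same wait. The class `DelayedLabeler`, its method names and the time units are assumptions made for this sketch.

```python
from collections import deque


class DelayedLabeler:
    """Hold each impression for `delay` time units before emitting its label."""

    def __init__(self, delay):
        self.delay = delay
        self.pending = deque()      # (time, impression_id, features), oldest first
        self.pending_ids = set()
        self.converted = set()

    def impression(self, now, impression_id, features):
        self.pending.append((now, impression_id, features))
        self.pending_ids.add(impression_id)

    def conversion(self, impression_id):
        # Conversions that arrive after the waiting period are ignored.
        if impression_id in self.pending_ids:
            self.converted.add(impression_id)

    def flush(self, now):
        """Return (features, label) pairs whose waiting period has ended."""
        ready = []
        while self.pending and now - self.pending[0][0] >= self.delay:
            _, impression_id, features = self.pending.popleft()
            label = 1 if impression_id in self.converted else 0
            self.pending_ids.discard(impression_id)
            self.converted.discard(impression_id)
            ready.append((features, label))
        return ready


labeler = DelayedLabeler(delay=10)
labeler.impression(0, "a", [1])
labeler.impression(1, "b", [0])
labeler.conversion("a")
early = labeler.flush(5)
first = labeler.flush(10)
second = labeler.flush(11)
```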
- 35. Required Steps
Learn distribution of parameters from data
– Logistic regression or probit regression (can be on-line!)
– Need Bayesian learning algorithm
Sample from posterior distribution
– Generally included in Bayesian learning algorithm
Pick design
– Simple sequential search
Record data
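The learning and sampling steps could look roughly like the sketch below. It keeps an independent Gaussian per parameter and nudges it after each record, a crude heuristic stand-in for a proper Bayesian logistic regression rather than an exact posterior update. The class name, prior precision and toy records are all assumptions.

```python
import math
import random


class DiagonalBayesLogistic:
    """Heuristic online logistic learner with a per-parameter Gaussian."""

    def __init__(self, n_features, prior_precision=1.0, rng=None):
        self.mean = [0.0] * n_features
        self.precision = [prior_precision] * n_features
        self.rng = rng or random.Random()

    def sample(self):
        # Draw one parameter vector from the approximate posterior.
        return [self.rng.gauss(m, 1.0 / math.sqrt(q))
                for m, q in zip(self.mean, self.precision)]

    def update(self, x, y):
        # One Newton-like step per record using only diagonal curvature.
        z = sum(m * xi for m, xi in zip(self.mean, x))
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        p = 1.0 / (1.0 + math.exp(-z))
        for i, xi in enumerate(x):
            self.precision[i] += xi * xi * p * (1.0 - p)
            self.mean[i] += xi * (y - p) / self.precision[i]


learner = DiagonalBayesLogistic(n_features=2, rng=random.Random(0))
for _ in range(100):
    learner.update([1.0, 1.0], 1)    # hypothetical record that converted
    learner.update([1.0, -1.0], 0)   # hypothetical record that did not
theta = learner.sample()
```

The precisions only grow, so the sampled parameters spread less as data accumulates and the choice of design settles down over time.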
- 37. Hadoop is Not Very Real-time
[Figure: timeline along t ending at "now". Labels: Fully processed; Latest full period; Unprocessed Data; Hadoop job takes this long for this data]
- 38. Real-time and Long-time together
[Figure: timeline along t ending at "now". Labels: Blended view; Hadoop works great back here; Storm works here]
- 39. Traditional Hadoop Design
Can use Kafka cluster to queue log lines
Can use Storm cluster to do real time learning
Can host web site on NAS
Can use Flume cluster to import data from Kafka to Hadoop
Can record long-term history on Hadoop Cluster
How many clusters?
- 40. [Architecture diagram. Labels: Users, Web Site, Web Service, Design Targeting, NAS, Kafka API, Kafka Cluster, Storm, Flume, Hadoop, HDFS Data]
- 41. That is a lot of moving parts!
- 42. Alternative Design
Can host log catcher on MapR via NFS
Storm can read data directly from queue
Can host web server directly on cluster
Only one cluster needed
– Total instance count drops by 3x
– Admin burden massively decreased
- 43. [Architecture diagram. Labels: Users, http, Web-server, Catcher, Topic Queue, Storm, Web Data, MapR]
- 44. You can do this yourself!
- 45. Contact Me!
We’re hiring at MapR in US and Europe
MapR software available for research use
Contact me at tdunning@maprtech.com or @ted_dunning
Share news with @apachemahout
Tweet #devoxxfr #mapr #mahout @ted_dunning