Interactive machine learning
Daniel Hsu
Columbia University
Non-interactive machine learning
“Cartoon” of (non-interactive) machine learning:
1. Get labeled data {(input_i, output_i)}_{i=1}^n.
2. Learn prediction function ˆf (e.g., classifier, regressor, policy) such that
   ˆf(input_i) ≈ output_i
   for most i = 1, 2, . . . , n.
Goal: ˆf(input) ≈ output for future (input, output) pairs.
Some applications: document classification, face detection, speech recognition, machine translation, credit rating, . . .
There is a lot of technology for doing this, and a lot of mathematical theory for understanding it.
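To make the “cartoon” concrete, here is a minimal sketch of the non-interactive recipe in Python; the synthetic data, the logistic-regression model, and the train/test split are illustrative assumptions, not choices made in the talk.

```python
# Minimal sketch of non-interactive (supervised) learning.
# The synthetic data and logistic-regression model are illustrative choices only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# 1. Get labeled data {(input_i, output_i)}_{i=1}^n.
n = 1000
inputs = rng.normal(size=(n, 5))
outputs = (inputs @ rng.normal(size=5) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(inputs, outputs, test_size=0.2)

# 2. Learn a prediction function f_hat such that f_hat(input_i) ≈ output_i for most i.
f_hat = LogisticRegression().fit(X_train, y_train)

# Goal: f_hat(input) ≈ output for future (input, output) pairs.
print("held-out accuracy:", f_hat.score(X_test, y_test))
```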
Interactive machine learning: example #1
Practicing physician
Loop:
1. Patient arrives with symptoms, medical history, genome . . .
2. Prescribe treatment.
3. Observe impact on patient’s health (e.g., improves, worsens).
Goal: prescribe treatments that yield good health outcomes.
Interactive machine learning: example #2
Website operator
Loop:
1. User visits website with profile, browsing history . . .
2. Choose content to display on website.
3. Observe user reaction to content (e.g., click, “like”).
Goal: choose content that yields desired user behavior.
Interactive machine learning: example #3
E-mail service provider
Loop:
1. Receive e-mail messages for users (spam or not).
2. Ask users to provide labels for some borderline cases.
3. Improve spam filter using newly labeled messages.
Goal: maximize accuracy of the spam filter while minimizing the number of label queries to users.
Characteristics of interactive machine learning problems
1. Learning agent (a.k.a. “learner”) interacts with the world (e.g., patients, users) to gather data.
2. Learner’s performance is based on the learner’s decisions.
3. Data available to the learner depends on the learner’s decisions.
4. State of the world depends on the learner’s decisions.
This talk: two interactive machine learning problems,
1. active learning, and
2. contextual bandit learning.
(Our models for these problems have all but the last of the above characteristics.)
Active learning
Motivation
Recall non-interactive (a.k.a. passive) machine learning:
1. Get labeled data {(input_i, label_i)}_{i=1}^n.
2. Learn function ˆf (e.g., classifier, regressor, decision rule, policy) such that
   ˆf(input_i) ≈ label_i
   for most i = 1, 2, . . . , n.
Problem: labels are often much more expensive to get than inputs.
E.g., can’t ask for the spam/not-spam label of every e-mail.
Active learning
Basic structure of active learning:
Start with pool of unlabeled inputs (and maybe some (input, label) pairs).
Repeat:
1. Train prediction function ˆf using current labeled pairs.
2. Pick some inputs from the pool, and query their labels.
Goal:
Final prediction function ˆf satisfies ˆf(input) ≈ label for most future (input, label) pairs.
Number of label queries is small.
(Can be selective/adaptive about which labels to query . . . )
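A minimal sketch of this pool-based loop follows; the synthetic pool, the labeling oracle, the scikit-learn model, and the random query strategy are all illustrative assumptions standing in for whatever the application provides.

```python
# Sketch of pool-based active learning; the query strategy is a plug-in function.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def oracle_label(x):
    # Stand-in for a human labeler (hypothetical labeling rule).
    return int(x.sum() > 0)

pool = rng.normal(size=(500, 3))                          # unlabeled inputs
labeled_X, labeled_y = [], []
for i in rng.choice(len(pool), size=10, replace=False):   # a few seed labels
    labeled_X.append(pool[i]); labeled_y.append(oracle_label(pool[i]))

def pick_query(model, pool):
    # Placeholder strategy: query a uniformly random pool point.
    return int(rng.integers(len(pool)))

for _ in range(50):                                       # label budget
    f_hat = LogisticRegression().fit(np.array(labeled_X), np.array(labeled_y))
    i = pick_query(f_hat, pool)                           # (queried points not removed, for brevity)
    labeled_X.append(pool[i]); labeled_y.append(oracle_label(pool[i]))
```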
Sampling bias
Main difficulty of active learning: sampling bias.
At any point in the active learning process, the labeled data set is typically not representative of the overall data.
Problem can derail learning and go unnoticed!
Example: input ∈ R, label ∈ {0, 1}, classifier type = threshold functions.
[Figure: one-dimensional data distribution split into regions with probability masses 45%, 45%, 5%, 5%.]
Typical active learning heuristic:
Start with pool of unlabeled inputs.
Pick a handful of inputs, and query their labels.
Repeat:
1. Train classifier ˆf using current labeled pairs.
2. Pick the input from the pool closest to the decision boundary of ˆf, and query its label.
On the example above, the learner converges to a classifier with 5% error rate, even though there is a classifier with 2.5% error rate. Not statistically consistent!
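The “closest to the decision boundary” rule can be dropped into the earlier sketch in place of the random query strategy; using decision_function as the margin is an illustrative assumption about the model.

```python
import numpy as np

def pick_query_uncertainty(model, pool):
    # Query the pool point with the smallest absolute margin,
    # i.e., the input closest to the current decision boundary.
    margins = np.abs(model.decision_function(pool))
    return int(np.argmin(margins))
```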
Framework for statistically consistent active learning
Importance weighted active learning
(Beygelzimer, Dasgupta, and Langford, ICML 2009)
Start with unlabeled pool {input_i}_{i=1}^n.
For each i = 1, 2, . . . , n (in random order):
1. Get input_i.
2. Pick probability p_i ∈ [0, 1], toss coin with P(heads) = p_i.
3. If heads, query label_i, and include (input_i, label_i) with importance weight 1/p_i in the data set.
Importance weight = “inverse propensity” that label_i is queried.
Used to account for sampling bias in error rate estimates.
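Below is a minimal sketch of this importance-weighted loop; the fixed query probability is a placeholder for the principled rules in the cited papers, and the pool and labeling oracle are illustrative.

```python
# Minimal sketch of importance weighted active learning (IWAL).
# The rule for choosing p_i is a placeholder, not one from the cited papers.
import numpy as np

rng = np.random.default_rng(0)

def oracle_label(x):
    # Stand-in for a human labeler (hypothetical labeling rule).
    return int(x.sum() > 0)

def choose_probability(x, p_min=0.1):
    # Placeholder: a real rule would depend on x and the current classifier,
    # but should never fall below the floor p_min.
    return max(p_min, 0.5)

pool = rng.normal(size=(200, 3))          # unlabeled pool {input_i}
weighted_data = []                        # (input_i, label_i, importance weight 1/p_i)

for x in rng.permutation(pool):           # process the pool in random order
    p = choose_probability(x)
    if rng.random() < p:                  # coin with P(heads) = p_i
        weighted_data.append((x, oracle_label(x), 1.0 / p))
```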
Importance weighted active learning algorithms
How to choose p_i?
1. Use your favorite heuristic (as long as p_i is never too small).
2. Simple rule based on estimated error rate differences
   (Beygelzimer, Hsu, Langford, and Zhang, NIPS 2010).
   In many cases, can prove the following:
   Final classifier is as accurate as a learner that queries all n labels.
   Expected number of queries Σ_{i=1}^n p_i grows sub-linearly with n.
3. Solve a large convex optimization problem
   (Huang, Agarwal, Hsu, Langford, and Schapire, NIPS 2015).
   Even stronger theoretical guarantees.
The latter two methods can be made to work with any classifier type, provided a good (non-interactive) learning algorithm.
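The importance weights are what make error-rate estimates unbiased despite selective querying; a minimal sketch of such an estimator is below (the function name and data layout are illustrative, matching the IWAL sketch above).

```python
# Importance-weighted ("inverse propensity") estimate of a classifier's error rate,
# computed from the weighted data collected by the IWAL loop above.
def iw_error_estimate(classifier, weighted_data, n):
    """weighted_data: list of (input, label, 1/p) for queried points; n: pool size."""
    total = 0.0
    for x, y, w in weighted_data:
        if classifier(x) != y:       # mistake on a queried point...
            total += w               # ...counted with weight 1/p to undo sampling bias
    return total / n
```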
Contextual bandit learning
Protocol for contextual bandit learning:
For round t = 1, 2, . . . , T:
1. Observe context_t. [e.g., user profile]
2. Choose action_t ∈ Actions. [e.g., display ad]
3. Collect reward_t(action_t) ∈ [0, 1]. [e.g., click or no-click]
Goal: Choose actions that yield high reward.
Note: In round t, only observe reward for the chosen action_t, and not the reward you would’ve received if you’d chosen a different action.
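A minimal sketch of this protocol as a simulation loop; the toy environment, the feature dimension, and the uniformly random placeholder learner are all illustrative assumptions.

```python
# Sketch of the contextual bandit protocol with a placeholder (random) learner.
import numpy as np

rng = np.random.default_rng(0)
ACTIONS = [0, 1, 2]                  # e.g., which ad to display
T = 1000

def environment(t):
    # Hypothetical world: returns context_t and the full (hidden) reward table.
    context = rng.normal(size=4)                        # e.g., user profile features
    rewards = {a: float(rng.random() < 0.1 * (a + 1))   # e.g., click / no-click
               for a in ACTIONS}
    return context, rewards

total_reward = 0.0
for t in range(1, T + 1):
    context_t, reward_t = environment(t)      # 1. observe context_t
    action_t = int(rng.choice(ACTIONS))       # 2. choose action_t (placeholder learner)
    total_reward += reward_t[action_t]        # 3. collect reward_t(action_t) only;
    # reward_t[a] for a != action_t is never revealed to the learner.
```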
Challenges in contextual bandit learning
1. Explore vs. exploit:
   Should use what you’ve already learned about actions.
   Must also learn about new actions that could be good.
2. Must use context effectively.
   Which action is good is likely to depend on the current context.
   Want to do as well as the best policy (i.e., decision rule)
   π : context → action
   from some rich class of policies Π.
3. Selection bias, especially when exploiting.
Why is exploration necessary?
No-exploration approach:
1. Using historical data, learn a “reward predictor” for each action based on context:
   reward(action | context).
2. Then deploy policy ˆπ, given by
   ˆπ(context) := argmax_{action ∈ Actions} reward(action | context),
   and collect more data.
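A sketch of this no-exploration step: fit one reward regressor per action on logged data, then always play the argmax. The ridge-regression model and the (context, action, reward) log format are illustrative assumptions.

```python
# Sketch of the no-exploration approach: per-action reward regression + greedy argmax.
import numpy as np
from sklearn.linear_model import Ridge

def fit_reward_predictors(logged, actions):
    """logged: list of (context, action, reward) tuples from historical data."""
    predictors = {}
    for a in actions:
        X = np.array([c for c, act, r in logged if act == a])
        y = np.array([r for c, act, r in logged if act == a])
        predictors[a] = Ridge().fit(X, y)   # assumes every action appears in the log
    return predictors

def greedy_policy(predictors, context):
    # pi_hat(context) := argmax over actions of predicted reward(action | context)
    return max(predictors,
               key=lambda a: predictors[a].predict(context.reshape(1, -1))[0])
```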
Using no-exploration
Example: two actions {a1, a2}, two contexts {x1, x2}.
Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2.

Observed rewards:
       a1    a2
  x1   0.7   —
  x2   —     0.1

Reward estimates:
       a1    a2
  x1   0.7   0.5
  x2   0.5   0.1

New policy: ˆπ′(x1) = ˆπ′(x2) = a1.

Observed rewards:
       a1    a2
  x1   0.7   —
  x2   0.3   0.1

Reward estimates:
       a1    a2
  x1   0.7   0.5
  x2   0.3   0.1

True rewards:
       a1    a2
  x1   0.7   1.0
  x2   0.3   0.1

Never try a2 in context x1. Highly sub-optimal.
Framework for simultaneous exploration and exploitation
Initialize distribution W over some policies.
For t = 1, 2, . . . , T:
1. Observe context_t.
2. Randomly pick policy ˆπ ∼ W, choose action_t = ˆπ(context_t) ∈ Actions.
3. Collect reward_t(action_t) ∈ [0, 1], update distribution W.
Can use “inverse propensity” importance weights to account for sampling bias when estimating expected rewards of policies.
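A minimal sketch of that inverse-propensity estimate of a policy’s expected reward from logged interactions; the log format (context, chosen action, observed reward, probability of that action) is an illustrative assumption.

```python
# Inverse propensity score (IPS) estimate of a policy's expected reward,
# from logs of (context, chosen action, observed reward, probability of that action).
def ips_policy_value(policy, logs):
    total = 0.0
    for context, action, reward, prob in logs:
        if policy(context) == action:    # policy agrees with the logged action...
            total += reward / prob       # ...so its reward counts, reweighted by 1/propensity
    return total / len(logs)
```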
Algorithms for contextual bandit learning
How to choose the distribution W over policies?
Exp4 (Auer, Cesa-Bianchi, Freund, and Schapire, FOCS 1995)
   Maintains exponential weights over all policies in a policy class Π.
Monster (Dudik, Hsu, Kale, Karampatziakis, Langford, Reyzin, and Zhang, UAI 2011)
   Solves an optimization problem to come up with poly(T, log |Π|)-sparse weights over policies.
Mini-monster (Agarwal, Hsu, Kale, Langford, Li, and Schapire, ICML 2014)
   Simpler and faster algorithm to solve the optimization problem; gets much sparser weights over policies.
The latter two methods rely on a good (non-interactive) learning algorithm for the policy class Π.
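For intuition, here is a compact sketch of Exp4-style exponential weighting over a tiny, explicitly enumerated policy class; the toy constant policies, reward model, and exploration rate γ are illustrative, and a realistic class Π would be far too large to enumerate this way (the gap that Monster and Mini-monster address).

```python
# Sketch of Exp4-style exponential weights over a small, enumerable policy class.
import numpy as np

rng = np.random.default_rng(0)
K = 3                                                 # number of actions
policies = [lambda c, a=a: a for a in range(K)]       # toy Pi: constant policies
w = np.ones(len(policies))                            # exponential weights over Pi
gamma = 0.1                                           # uniform-exploration rate

for t in range(1000):
    context = rng.normal(size=4)                          # observe context_t
    recs = np.array([pi(context) for pi in policies])     # each policy's recommended action
    probs = np.full(K, gamma / K)                         # mix in uniform exploration
    for a in range(K):
        probs[a] += (1 - gamma) * w[recs == a].sum() / w.sum()
    action = int(rng.choice(K, p=probs))                  # randomized action_t
    reward = float(rng.random() < 0.1 * (action + 1))     # toy reward in [0, 1]
    r_hat = np.zeros(K)
    r_hat[action] = reward / probs[action]                # inverse propensity reward estimate
    w *= np.exp(gamma * r_hat[recs] / K)                  # reweight policies by estimated reward
```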
Wrap-up
Interactive machine learning models capture how machine learning is used in many applications better than non-interactive frameworks do.
Sampling bias is a pervasive issue.
Research: how to use non-interactive machine learning technology in these new interactive settings.
Thanks!

Más contenido relacionado

Destacado

The Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleThe Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleKrishna Sankar
 
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)Matthew Russell
 
Building Tooling And Culture Together
Building Tooling And Culture TogetherBuilding Tooling And Culture Together
Building Tooling And Culture TogetherNishan Subedi
 
NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...
NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...
NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...Rizwan Habib
 
NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...
NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...
NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...Rizwan Habib
 
NYAI #5 - Fun With Neural Nets by Jason Yosinski
NYAI #5 - Fun With Neural Nets by Jason YosinskiNYAI #5 - Fun With Neural Nets by Jason Yosinski
NYAI #5 - Fun With Neural Nets by Jason YosinskiRizwan Habib
 
NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...
NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...
NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...Rizwan Habib
 
NYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden LakeNYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden LakeRizwan Habib
 
NYAI - Understanding Music Through Machine Learning by Brian McFee
NYAI - Understanding Music Through Machine Learning by Brian McFeeNYAI - Understanding Music Through Machine Learning by Brian McFee
NYAI - Understanding Music Through Machine Learning by Brian McFeeRizwan Habib
 
Virtual Madness @ Etsy
Virtual Madness @ EtsyVirtual Madness @ Etsy
Virtual Madness @ EtsyNishan Subedi
 
NYAI - Commodity Machine Learning & Beyond by Andreas Mueller
NYAI - Commodity Machine Learning & Beyond by Andreas MuellerNYAI - Commodity Machine Learning & Beyond by Andreas Mueller
NYAI - Commodity Machine Learning & Beyond by Andreas MuellerRizwan Habib
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learnodsc
 
Mining the Social Web for Fun and Profit: A Getting Started Guide
Mining the Social Web for Fun and Profit: A Getting Started GuideMining the Social Web for Fun and Profit: A Getting Started Guide
Mining the Social Web for Fun and Profit: A Getting Started GuideMatthew Russell
 
Privacy, Ethics, and Future Uses of the Social Web
Privacy, Ethics, and Future Uses of the Social WebPrivacy, Ethics, and Future Uses of the Social Web
Privacy, Ethics, and Future Uses of the Social WebMatthew Russell
 
Lessons Learned from Running Hundreds of Kaggle Competitions
Lessons Learned from Running Hundreds of Kaggle CompetitionsLessons Learned from Running Hundreds of Kaggle Competitions
Lessons Learned from Running Hundreds of Kaggle CompetitionsBen Hamner
 
What convnets look at when they look at nudity
What convnets look at when they look at nudityWhat convnets look at when they look at nudity
What convnets look at when they look at nudityRyan Compton
 
KAMAS Health 2.0 Presentation
KAMAS Health 2.0 PresentationKAMAS Health 2.0 Presentation
KAMAS Health 2.0 Presentationatduskgreg
 
Toefl speaking test preparation Lesson 9 (t)
Toefl speaking test preparation Lesson 9 (t)Toefl speaking test preparation Lesson 9 (t)
Toefl speaking test preparation Lesson 9 (t)SkimaTalk
 
DESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.in
DESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.inDESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.in
DESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.inchrist university
 

Destacado (20)

The Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleThe Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to Kaggle
 
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
 
Building Tooling And Culture Together
Building Tooling And Culture TogetherBuilding Tooling And Culture Together
Building Tooling And Culture Together
 
NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...
NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...
NYAI #7 - Using Data Science to Operationalize Machine Learning by Matthew Ru...
 
NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...
NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...
NYAI #7 - Top-down vs. Bottom-up Computational Creativity by Dr. Cole D. Ingr...
 
NYAI #5 - Fun With Neural Nets by Jason Yosinski
NYAI #5 - Fun With Neural Nets by Jason YosinskiNYAI #5 - Fun With Neural Nets by Jason Yosinski
NYAI #5 - Fun With Neural Nets by Jason Yosinski
 
NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...
NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...
NYAI #8 - HOLIDAY PARTY + NYC AI OVERVIEW with NYC's Chief Digital Officer Sr...
 
NYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden LakeNYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden Lake
 
NYAI - Understanding Music Through Machine Learning by Brian McFee
NYAI - Understanding Music Through Machine Learning by Brian McFeeNYAI - Understanding Music Through Machine Learning by Brian McFee
NYAI - Understanding Music Through Machine Learning by Brian McFee
 
Virtual Madness @ Etsy
Virtual Madness @ EtsyVirtual Madness @ Etsy
Virtual Madness @ Etsy
 
NYAI - Commodity Machine Learning & Beyond by Andreas Mueller
NYAI - Commodity Machine Learning & Beyond by Andreas MuellerNYAI - Commodity Machine Learning & Beyond by Andreas Mueller
NYAI - Commodity Machine Learning & Beyond by Andreas Mueller
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
 
Mining the Social Web for Fun and Profit: A Getting Started Guide
Mining the Social Web for Fun and Profit: A Getting Started GuideMining the Social Web for Fun and Profit: A Getting Started Guide
Mining the Social Web for Fun and Profit: A Getting Started Guide
 
Privacy, Ethics, and Future Uses of the Social Web
Privacy, Ethics, and Future Uses of the Social WebPrivacy, Ethics, and Future Uses of the Social Web
Privacy, Ethics, and Future Uses of the Social Web
 
Lessons Learned from Running Hundreds of Kaggle Competitions
Lessons Learned from Running Hundreds of Kaggle CompetitionsLessons Learned from Running Hundreds of Kaggle Competitions
Lessons Learned from Running Hundreds of Kaggle Competitions
 
What convnets look at when they look at nudity
What convnets look at when they look at nudityWhat convnets look at when they look at nudity
What convnets look at when they look at nudity
 
KAMAS Health 2.0 Presentation
KAMAS Health 2.0 PresentationKAMAS Health 2.0 Presentation
KAMAS Health 2.0 Presentation
 
Toefl speaking test preparation Lesson 9 (t)
Toefl speaking test preparation Lesson 9 (t)Toefl speaking test preparation Lesson 9 (t)
Toefl speaking test preparation Lesson 9 (t)
 
Relative pronouns
Relative  pronounsRelative  pronouns
Relative pronouns
 
DESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.in
DESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.inDESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.in
DESKTOP V/S LAPTOP - Stegin.joy@bca.christuniversity.in
 

Similar a NYAI - Interactive Machine Learning by Daniel Hsu

Machine Learning
Machine LearningMachine Learning
Machine LearningShrey Malik
 
AI_06_Machine Learning.pptx
AI_06_Machine Learning.pptxAI_06_Machine Learning.pptx
AI_06_Machine Learning.pptxYousef Aburawi
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfhemangppatel
 
Machine Learning ebook.pdf
Machine Learning ebook.pdfMachine Learning ebook.pdf
Machine Learning ebook.pdfHODIT12
 
1_5_AI_edx_ml_51intro_240204_104838machine learning lecture 1
1_5_AI_edx_ml_51intro_240204_104838machine learning lecture 11_5_AI_edx_ml_51intro_240204_104838machine learning lecture 1
1_5_AI_edx_ml_51intro_240204_104838machine learning lecture 1MostafaHazemMostafaa
 
Machine learning SVM
Machine  learning SVMMachine  learning SVM
Machine learning SVMRaj Kamal
 
Machine Learning Chapter one introduction
Machine Learning Chapter one introductionMachine Learning Chapter one introduction
Machine Learning Chapter one introductionARVIND SARDAR
 
Machine Learning
Machine LearningMachine Learning
Machine Learningbutest
 
Machine learning and decision trees
Machine learning and decision treesMachine learning and decision trees
Machine learning and decision treesPadma Metta
 
Bridging the gap between closed and open items or how to make CALL more intel...
Bridging the gap between closed and open items or how to make CALL more intel...Bridging the gap between closed and open items or how to make CALL more intel...
Bridging the gap between closed and open items or how to make CALL more intel...bwylin
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research OverviewKathirvel Ayyaswamy
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack BigDataExpo
 
Lect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdfLect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdfHassanElalfy4
 
MS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning AlgorithmMS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning AlgorithmKaniska Mandal
 
AI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World TutorialAI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World TutorialTariq King
 
Meetup 29042015
Meetup 29042015Meetup 29042015
Meetup 29042015lbishal
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningAI Summary
 

Similar a NYAI - Interactive Machine Learning by Daniel Hsu (20)

Machine Learning
Machine LearningMachine Learning
Machine Learning
 
AI_06_Machine Learning.pptx
AI_06_Machine Learning.pptxAI_06_Machine Learning.pptx
AI_06_Machine Learning.pptx
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdf
 
Machine Learning ebook.pdf
Machine Learning ebook.pdfMachine Learning ebook.pdf
Machine Learning ebook.pdf
 
1_5_AI_edx_ml_51intro_240204_104838machine learning lecture 1
1_5_AI_edx_ml_51intro_240204_104838machine learning lecture 11_5_AI_edx_ml_51intro_240204_104838machine learning lecture 1
1_5_AI_edx_ml_51intro_240204_104838machine learning lecture 1
 
Machine learning SVM
Machine  learning SVMMachine  learning SVM
Machine learning SVM
 
Machine Learning Chapter one introduction
Machine Learning Chapter one introductionMachine Learning Chapter one introduction
Machine Learning Chapter one introduction
 
Machine learning
Machine learningMachine learning
Machine learning
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Week 1.pdf
Week 1.pdfWeek 1.pdf
Week 1.pdf
 
Machine learning and decision trees
Machine learning and decision treesMachine learning and decision trees
Machine learning and decision trees
 
Bridging the gap between closed and open items or how to make CALL more intel...
Bridging the gap between closed and open items or how to make CALL more intel...Bridging the gap between closed and open items or how to make CALL more intel...
Bridging the gap between closed and open items or how to make CALL more intel...
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research Overview
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
 
Lect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdfLect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdf
 
MS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning AlgorithmMS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning Algorithm
 
AI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World TutorialAI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World Tutorial
 
Meetup 29042015
Meetup 29042015Meetup 29042015
Meetup 29042015
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 

Último

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 

Último (20)

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 

NYAI - Interactive Machine Learning by Daniel Hsu

  • 1. Interactive machine learning Daniel Hsu Columbia University 1
  • 2. Non-interactive machine learning “Cartoon” of (non-interactive) machine learning 2
  • 3. Non-interactive machine learning “Cartoon” of (non-interactive) machine learning 1. Get labeled data {(inputi , outputi )}n i=1. 2
  • 4. Non-interactive machine learning “Cartoon” of (non-interactive) machine learning 1. Get labeled data {(inputi , outputi )}n i=1. 2. Learn prediction function ˆf (e.g., classifier, regressor, policy) such that ˆf (inputi ) ≈ outputi for most i = 1, 2, . . . , n. 2
  • 5. Non-interactive machine learning “Cartoon” of (non-interactive) machine learning 1. Get labeled data {(inputi , outputi )}n i=1. 2. Learn prediction function ˆf (e.g., classifier, regressor, policy) such that ˆf (inputi ) ≈ outputi for most i = 1, 2, . . . , n. Goal: ˆf (input) ≈ output for future (input, output) pairs. 2
  • 6. Non-interactive machine learning “Cartoon” of (non-interactive) machine learning 1. Get labeled data {(inputi , outputi )}n i=1. 2. Learn prediction function ˆf (e.g., classifier, regressor, policy) such that ˆf (inputi ) ≈ outputi for most i = 1, 2, . . . , n. Goal: ˆf (input) ≈ output for future (input, output) pairs. Some applications: document classification, face detection, speech recognition, machine translation, credit rating, . . . 2
  • 7. Non-interactive machine learning “Cartoon” of (non-interactive) machine learning 1. Get labeled data {(inputi , outputi )}n i=1. 2. Learn prediction function ˆf (e.g., classifier, regressor, policy) such that ˆf (inputi ) ≈ outputi for most i = 1, 2, . . . , n. Goal: ˆf (input) ≈ output for future (input, output) pairs. Some applications: document classification, face detection, speech recognition, machine translation, credit rating, . . . There is a lot of technology for doing this, and a lot of mathematical theories for understanding this. 2
  • 8. Interactive machine learning: example #1 Practicing physician 3
  • 9. Interactive machine learning: example #1 Practicing physician Loop: 1. Patient arrives with symptoms, medical history, genome . . . 3
  • 10. Interactive machine learning: example #1 Practicing physician Loop: 1. Patient arrives with symptoms, medical history, genome . . . 2. Prescribe treatment. 3
  • 11. Interactive machine learning: example #1 Practicing physician Loop: 1. Patient arrives with symptoms, medical history, genome . . . 2. Prescribe treatment. 3. Observe impact on patient’s health (e.g., improves, worsens). 3
  • 12. Interactive machine learning: example #1 Practicing physician Loop: 1. Patient arrives with symptoms, medical history, genome . . . 2. Prescribe treatment. 3. Observe impact on patient’s health (e.g., improves, worsens). Goal prescribe treatments that yield good health outcomes. 3
  • 13. Interactive machine learning: example #2 Website operator 4
  • 14. Interactive machine learning: example #2 Website operator Loop: 1. User visits website with profile, browsing history . . . 4
  • 15. Interactive machine learning: example #2 Website operator Loop: 1. User visits website with profile, browsing history . . . 2. Choose content to display on website. 4
  • 16. Interactive machine learning: example #2 Website operator Loop: 1. User visits website with profile, browsing history . . . 2. Choose content to display on website. 3. Observe user reaction to content (e.g., click, “like”). 4
  • 17. Interactive machine learning: example #2 Website operator Loop: 1. User visits website with profile, browsing history . . . 2. Choose content to display on website. 3. Observe user reaction to content (e.g., click, “like”). Goal choose content that yield desired user behavior. 4
  • 18. Interactive machine learning: example #3 E-mail service provider 5
  • 19. Interactive machine learning: example #3 E-mail service provider Loop: 1. Receive e-mail messages for users (spam or not). 5
  • 20. Interactive machine learning: example #3 E-mail service provider Loop: 1. Receive e-mail messages for users (spam or not). 2. Ask users to provide labels for some borderline cases. 5
  • 21. Interactive machine learning: example #3 E-mail service provider Loop: 1. Receive e-mail messages for users (spam or not). 2. Ask users to provide labels for some borderline cases. 3. Improve spam filter using newly labeled messages. 5
  • 22. Interactive machine learning: example #3 E-mail service provider Loop: 1. Receive e-mail messages for users (spam or not). 2. Ask users to provide labels for some borderline cases. 3. Improve spam filter using newly labeled messages. Goal maximize accuracy of spam filter, minimize number of queries to users. 5
  • 23. Characteristics of interactive machine learning problems 6
  • 24. Characteristics of interactive machine learning problems 1. Learning agent (a.k.a. “learner”) interacts with the world (e.g., patients, users) to gather data. 6
  • 25. Characteristics of interactive machine learning problems 1. Learning agent (a.k.a. “learner”) interacts with the world (e.g., patients, users) to gather data. 2. Learner’s performance based on learner’s decisions. 6
  • 26. Characteristics of interactive machine learning problems 1. Learning agent (a.k.a. “learner”) interacts with the world (e.g., patients, users) to gather data. 2. Learner’s performance based on learner’s decisions. 3. Data available to learner depends on learner’s decisions. 6
  • 27. Characteristics of interactive machine learning problems 1. Learning agent (a.k.a. “learner”) interacts with the world (e.g., patients, users) to gather data. 2. Learner’s performance based on learner’s decisions. 3. Data available to learner depends on learner’s decisions. 4. State of the world depends on learner’s decisions. 6
  • 28. Characteristics of interactive machine learning problems 1. Learning agent (a.k.a. “learner”) interacts with the world (e.g., patients, users) to gather data. 2. Learner’s performance based on learner’s decisions. 3. Data available to learner depends on learner’s decisions. 4. State of the world depends on learner’s decisions. This talk: two interactive machine learning problems, 1. active learning, and 2. contextual bandit learning. (Our models for these problems have all but the last of above characteristics.) 6
  • 30. Motivation Recall non-interactive (a.k.a. passive) machine learning: 1. Get labeled data {(inputi , labeli )}n i=1. 2. Learn function ˆf (e.g., classifier, regressor, decision rule, policy) such that ˆf (inputi ) ≈ labeli for most i = 1, 2, . . . , n. 8
  • 31. Motivation Recall non-interactive (a.k.a. passive) machine learning: 1. Get labeled data {(inputi , labeli )}n i=1. 2. Learn function ˆf (e.g., classifier, regressor, decision rule, policy) such that ˆf (inputi ) ≈ labeli for most i = 1, 2, . . . , n. Problem: labels often much more expensive to get than inputs. E.g., can’t ask for spam/not-spam label of every e-mail. 8
  • 32. Active learning Basic structure of active learning: 9
  • 33. Active learning Basic structure of active learning: Start with pool of unlabeled inputs (and maybe some (input, label) pairs). 9
  • 34. Active learning Basic structure of active learning: Start with pool of unlabeled inputs (and maybe some (input, label) pairs). Repeat: 1. Train prediction function ˆf using current labeled pairs. 9
  • 35. Active learning Basic structure of active learning: Start with pool of unlabeled inputs (and maybe some (input, label) pairs). Repeat: 1. Train prediction function ˆf using current labeled pairs. 2. Pick some inputs from the pool, and query their labels. 9
  • 36. Active learning Basic structure of active learning: Start with pool of unlabeled inputs (and maybe some (input, label) pairs). Repeat: 1. Train prediction function ˆf using current labeled pairs. 2. Pick some inputs from the pool, and query their labels. Goal: 9
  • 37. Active learning Basic structure of active learning: Start with pool of unlabeled inputs (and maybe some (input, label) pairs). Repeat: 1. Train prediction function ˆf using current labeled pairs. 2. Pick some inputs from the pool, and query their labels. Goal: Final prediction function ˆf satisfies ˆf (input) ≈ label for most future (input, label) pairs. 9
  • 38. Active learning Basic structure of active learning: Start with pool of unlabeled inputs (and maybe some (input, label) pairs). Repeat: 1. Train prediction function ˆf using current labeled pairs. 2. Pick some inputs from the pool, and query their labels. Goal: Final prediction function ˆf satisfies ˆf (input) ≈ label for most future (input, label) pairs. Number of label queries is small. (Can be selective/adaptive about which labels to query . . . ) 9
  • 39. Sampling bias Main difficulty of active learning: sampling bias. 10
  • 40. Sampling bias Main difficulty of active learning: sampling bias. At any point in active learning process, labeled data set typically not representative of overall data. 10
  • 41. Sampling bias Main difficulty of active learning: sampling bias. At any point in active learning process, labeled data set typically not representative of overall data. Problem can derail learning and go unnoticed! 10
  • 42. Sampling bias Main difficulty of active learning: sampling bias. At any point in active learning process, labeled data set typically not representative of overall data. Problem can derail learning and go unnoticed! input ∈ R, label ∈ {0, 1}, classifier type = threshold functions. [figure: input distribution on the real line, region masses 45%, 45%, 5%, 5%] 10
  • 43. Sampling bias Typical active learning heuristic: Start with pool of unlabeled inputs. input ∈ R, label ∈ {0, 1}, classifier type = threshold functions. [figure: input distribution on the real line, region masses 45%, 45%, 5%, 5%] 10
  • 44. Sampling bias Typical active learning heuristic: Start with pool of unlabeled inputs. Pick a handful of inputs, and query their labels. input ∈ R, label ∈ {0, 1}, classifier type = threshold functions. [figure: input distribution on the real line, region masses 45%, 45%, 5%, 5%] 10
  • 45. Sampling bias Typical active learning heuristic: Start with pool of unlabeled inputs. Pick a handful of inputs, and query their labels. Repeat: 1. Train classifier ˆf using current labeled pairs. input ∈ R, label ∈ {0, 1}, classifier type = threshold functions. [figure: input distribution on the real line, region masses 45%, 45%, 5%, 5%] 10
  • 46. Sampling bias Typical active learning heuristic: Start with pool of unlabeled inputs. Pick a handful of inputs, and query their labels. Repeat: 1. Train classifier ˆf using current labeled pairs. 2. Pick input from pool closest to decision boundary of ˆf, and query its label. input ∈ R, label ∈ {0, 1}, classifier type = threshold functions. [figure: input distribution on the real line, region masses 45%, 45%, 5%, 5%] 10
  • 47. Sampling bias Typical active learning heuristic: Start with pool of unlabeled inputs. Pick a handful of inputs, and query their labels. Repeat: 1. Train classifier ˆf using current labeled pairs. 2. Pick input from pool closest to decision boundary of ˆf, and query its label. input ∈ R, label ∈ {0, 1}, classifier type = threshold functions. [figure: input distribution on the real line, region masses 45%, 45%, 5%, 5%] Learner converges to classifier with 5% error rate, even though there’s a classifier with 2.5% error rate. Not statistically consistent! 10
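For the toy threshold example above, the heuristic can be sketched as follows. The helpers train_threshold and closest_to_boundary are illustrative assumptions (not from the slides); they plug into the generic active-learning loop sketched earlier.

```python
def train_threshold(labeled):
    """Fit a 1-d threshold classifier (predict 1 iff x >= t) by brute force."""
    xs = sorted(x for x, _ in labeled)
    candidates = [float("-inf")] + xs
    def errors(t):
        return sum((x >= t) != y for x, y in labeled)
    return min(candidates, key=errors)

def closest_to_boundary(threshold, unlabeled, k):
    """Uncertainty-sampling heuristic: query the k pool points nearest the
    current decision boundary.  This is the step that introduces sampling
    bias: the labeled set piles up around the boundary and stops reflecting
    the overall input distribution."""
    return sorted(unlabeled, key=lambda x: abs(x - threshold))[:k]
```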
  • 48. Framework for statistically consistent active learning Importance weighted active learning (Beygelzimer, Dasgupta, and Langford, ICML 2009) 11
  • 49. Framework for statistically consistent active learning Importance weighted active learning (Beygelzimer, Dasgupta, and Langford, ICML 2009) Start with unlabeled pool {input_i}_{i=1}^n. 11
  • 50. Framework for statistically consistent active learning Importance weighted active learning (Beygelzimer, Dasgupta, and Langford, ICML 2009) Start with unlabeled pool {input_i}_{i=1}^n. For each i = 1, 2, . . . , n (in random order): 1. Get input_i. 11
  • 51. Framework for statistically consistent active learning Importance weighted active learning (Beygelzimer, Dasgupta, and Langford, ICML 2009) Start with unlabeled pool {input_i}_{i=1}^n. For each i = 1, 2, . . . , n (in random order): 1. Get input_i. 2. Pick probability p_i ∈ [0, 1], toss coin with P(heads) = p_i. 11
  • 52. Framework for statistically consistent active learning Importance weighted active learning (Beygelzimer, Dasgupta, and Langford, ICML 2009) Start with unlabeled pool {input_i}_{i=1}^n. For each i = 1, 2, . . . , n (in random order): 1. Get input_i. 2. Pick probability p_i ∈ [0, 1], toss coin with P(heads) = p_i. 3. If heads, query label_i, and include (input_i, label_i) with importance weight 1/p_i in data set. 11
  • 53. Framework for statistically consistent active learning Importance weighted active learning (Beygelzimer, Dasgupta, and Langford, ICML 2009) Start with unlabeled pool {input_i}_{i=1}^n. For each i = 1, 2, . . . , n (in random order): 1. Get input_i. 2. Pick probability p_i ∈ [0, 1], toss coin with P(heads) = p_i. 3. If heads, query label_i, and include (input_i, label_i) with importance weight 1/p_i in data set. Importance weight = inverse of the propensity (probability) that label_i is queried. Used to account for sampling bias in error rate estimates. 11
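A minimal sketch of this label-collection scheme. The function choose_p is a placeholder for the rules discussed on the following slides; label_oracle is an assumed labeling interface.

```python
import random

def iwal_collect(inputs, label_oracle, choose_p):
    """Importance weighted active learning, label-collection phase (sketch).

    Returns (input, label, weight) triples.  Each label is queried with
    probability p; if queried, the example carries weight 1/p (inverse
    propensity), which corrects for the selective querying in any
    importance-weighted error estimate computed later.
    """
    inputs = list(inputs)
    random.shuffle(inputs)                  # process the pool in random order
    weighted_data = []
    for x in inputs:
        p = choose_p(x, weighted_data)      # query probability in (0, 1]
        if random.random() < p:             # toss a coin with P(heads) = p
            y = label_oracle(x)
            weighted_data.append((x, y, 1.0 / p))
    return weighted_data
```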
  • 54. Importance weighted active learning algorithms How to choose p_i? 12
  • 55. Importance weighted active learning algorithms How to choose p_i? 1. Use your favorite heuristic (as long as p_i is never too small). 12
  • 56. Importance weighted active learning algorithms How to choose p_i? 1. Use your favorite heuristic (as long as p_i is never too small). 2. Simple rule based on estimated error rate differences (Beygelzimer, Hsu, Langford, and Zhang, NIPS 2010). 12
  • 57. Importance weighted active learning algorithms How to choose p_i? 1. Use your favorite heuristic (as long as p_i is never too small). 2. Simple rule based on estimated error rate differences (Beygelzimer, Hsu, Langford, and Zhang, NIPS 2010). In many cases, can prove the following: Final classifier is as accurate as learner that queries all n labels. Expected number of queries ∑_{i=1}^n p_i grows sub-linearly with n. 12
  • 58. Importance weighted active learning algorithms How to choose p_i? 1. Use your favorite heuristic (as long as p_i is never too small). 2. Simple rule based on estimated error rate differences (Beygelzimer, Hsu, Langford, and Zhang, NIPS 2010). In many cases, can prove the following: Final classifier is as accurate as learner that queries all n labels. Expected number of queries ∑_{i=1}^n p_i grows sub-linearly with n. 3. Solve a large convex optimization problem (Huang, Agarwal, Hsu, Langford, and Schapire, NIPS 2015). 12
  • 59. Importance weighted active learning algorithms How to choose p_i? 1. Use your favorite heuristic (as long as p_i is never too small). 2. Simple rule based on estimated error rate differences (Beygelzimer, Hsu, Langford, and Zhang, NIPS 2010). In many cases, can prove the following: Final classifier is as accurate as learner that queries all n labels. Expected number of queries ∑_{i=1}^n p_i grows sub-linearly with n. 3. Solve a large convex optimization problem (Huang, Agarwal, Hsu, Langford, and Schapire, NIPS 2015). Even stronger theoretical guarantees. 12
  • 60. Importance weighted active learning algorithms How to choose p_i? 1. Use your favorite heuristic (as long as p_i is never too small). 2. Simple rule based on estimated error rate differences (Beygelzimer, Hsu, Langford, and Zhang, NIPS 2010). In many cases, can prove the following: Final classifier is as accurate as learner that queries all n labels. Expected number of queries ∑_{i=1}^n p_i grows sub-linearly with n. 3. Solve a large convex optimization problem (Huang, Agarwal, Hsu, Langford, and Schapire, NIPS 2015). Even stronger theoretical guarantees. Latter two methods can be made to work with any classifier type, provided a good (non-interactive) learning algorithm. 12
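The importance weights enter through error-rate estimates like the one sketched below, which assumes the data were collected as in the earlier sketch; the specific rules from the cited papers build on such estimates but are not reproduced here.

```python
def iw_error_rate(classifier, weighted_data, n_total):
    """Importance-weighted estimate of a classifier's error rate (sketch).

    weighted_data : (input, label, weight) triples with weight = 1/p,
                    where p was the probability that label was queried
    n_total       : number of inputs processed (queried or not)

    In expectation over the query coin flips, this equals the error
    rate the classifier would have on all n_total examples.
    """
    weighted_errors = sum(w for x, y, w in weighted_data if classifier(x) != y)
    return weighted_errors / n_total
```

Roughly speaking, rule 2 above compares such estimates for competing classifiers to decide how large p_i must be for the next input; the exact rule is in the Beygelzimer et al. (2010) paper.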
  • 63. Contextual bandit learning Protocol for contextual bandit learning: For round t = 1, 2, . . . , T: 14
  • 64. Contextual bandit learning Protocol for contextual bandit learning: For round t = 1, 2, . . . , T: 1. Observe context_t. [e.g., user profile] 14
  • 65. Contextual bandit learning Protocol for contextual bandit learning: For round t = 1, 2, . . . , T: 1. Observe context_t. [e.g., user profile] 2. Choose action_t ∈ Actions. [e.g., display ad] 14
  • 66. Contextual bandit learning Protocol for contextual bandit learning: For round t = 1, 2, . . . , T: 1. Observe context_t. [e.g., user profile] 2. Choose action_t ∈ Actions. [e.g., display ad] 3. Collect reward_t(action_t) ∈ [0, 1]. [e.g., click or no-click] 14
  • 67. Contextual bandit learning Protocol for contextual bandit learning: For round t = 1, 2, . . . , T: 1. Observe context_t. [e.g., user profile] 2. Choose action_t ∈ Actions. [e.g., display ad] 3. Collect reward_t(action_t) ∈ [0, 1]. [e.g., click or no-click] Goal: Choose actions that yield high reward. 14
  • 68. Contextual bandit learning Protocol for contextual bandit learning: For round t = 1, 2, . . . , T: 1. Observe context_t. [e.g., user profile] 2. Choose action_t ∈ Actions. [e.g., display ad] 3. Collect reward_t(action_t) ∈ [0, 1]. [e.g., click or no-click] Goal: Choose actions that yield high reward. Note: In round t, only observe reward for chosen action_t, and not reward that you would’ve received if you’d chosen a different action. 14
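The protocol written as a loop, with environment and learner as assumed interfaces (placeholders) to show where each quantity appears:

```python
def run_contextual_bandit(learner, environment, T):
    """Contextual bandit protocol (sketch): in each round only the
    chosen action's reward is observed."""
    total_reward = 0.0
    for t in range(T):
        context = environment.observe_context()   # e.g., user profile
        action = learner.choose_action(context)   # e.g., which content to display
        reward = environment.get_reward(action)   # e.g., 1 = click, 0 = no click
        learner.update(context, action, reward)   # rewards of other actions stay hidden
        total_reward += reward
    return total_reward
```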
  • 69. Challenges in contextual bandit learning 15
  • 70. Challenges in contextual bandit learning 1. Explore vs. exploit: Should use what you’ve already learned about actions. Must also learn about new actions that could be good. 15
  • 71. Challenges in contextual bandit learning 1. Explore vs. exploit: Should use what you’ve already learned about actions. Must also learn about new actions that could be good. 2. Must use context effectively. Which action is good likely to depend on current context. Want to do as well as the best policy (i.e., decision rule) π: context → action from some rich class of policies Π. 15
  • 72. Challenges in contextual bandit learning 1. Explore vs. exploit: Should use what you’ve already learned about actions. Must also learn about new actions that could be good. 2. Must use context effectively. Which action is good likely to depend on current context. Want to do as well as the best policy (i.e., decision rule) π: context → action from some rich class of policies Π. 3. Selection bias, especially when exploiting. 15
  • 73. Why is exploration necessary? No-exploration approach: 16
  • 74. Why is exploration necessary? No-exploration approach: 1. Using historical data, learn a “reward predictor” for each action based on context: reward(action | context). 16
  • 75. Why is exploration necessary? No-exploration approach: 1. Using historical data, learn a “reward predictor” for each action based on context: reward(action | context). 2. Then deploy policy ˆπ, given by ˆπ(context) := arg max_{action ∈ Actions} reward(action | context), and collect more data. 16
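A sketch of the resulting greedy policy, assuming reward_hat(context, action) is any learned reward predictor (a placeholder):

```python
def greedy_policy(reward_hat, actions):
    """No-exploration policy: always play the action with the highest
    predicted reward for the current context."""
    def pi_hat(context):
        return max(actions, key=lambda a: reward_hat(context, a))
    return pi_hat
```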
  • 76. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. 17
  • 77. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = —, a2 = 0.1. 17
  • 78. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = —, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.5, a2 = 0.1. 17
  • 79. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = —, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.5, a2 = 0.1. New policy: ˆπ′(x1) = ˆπ′(x2) = a1. 17
  • 80. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = —, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.5, a2 = 0.1. New policy: ˆπ′(x1) = ˆπ′(x2) = a1. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = 0.3, a2 = 0.1. 17
  • 81. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = —, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.5, a2 = 0.1. New policy: ˆπ′(x1) = ˆπ′(x2) = a1. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = 0.3, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.3, a2 = 0.1. 17
  • 82. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = —, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.5, a2 = 0.1. New policy: ˆπ′(x1) = ˆπ′(x2) = a1. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = 0.3, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.3, a2 = 0.1. Never try a2 in context x1. 17
  • 83. Using no-exploration Example: two actions {a1, a2}, two contexts {x1, x2}. Suppose initial policy says ˆπ(x1) = a1 and ˆπ(x2) = a2. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = —, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.5, a2 = 0.1. New policy: ˆπ′(x1) = ˆπ′(x2) = a1. Observed rewards: x1: a1 = 0.7, a2 = —; x2: a1 = 0.3, a2 = 0.1. Reward estimates: x1: a1 = 0.7, a2 = 0.5; x2: a1 = 0.3, a2 = 0.1. True rewards: x1: a1 = 0.7, a2 = 1.0; x2: a1 = 0.3, a2 = 0.1. Never try a2 in context x1. Highly sub-optimal. 17
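A tiny simulation reproducing this failure. The true rewards are taken from the example above; the 0.5 default estimate for never-tried actions and the update rule are assumptions made only for illustration.

```python
# Toy reproduction of the no-exploration failure (assumed defaults, not from the slides).
true_reward = {("x1", "a1"): 0.7, ("x1", "a2"): 1.0,
               ("x2", "a1"): 0.3, ("x2", "a2"): 0.1}
estimate = {key: 0.5 for key in true_reward}      # default 0.5 for unseen (context, action)
estimate[("x1", "a1")] = 0.7                       # observed under the initial policy
estimate[("x2", "a2")] = 0.1

for _ in range(1000):                              # repeatedly deploy the greedy policy
    for context in ("x1", "x2"):
        action = max(("a1", "a2"), key=lambda a: estimate[(context, a)])
        estimate[(context, action)] = true_reward[(context, action)]   # observe its reward

print(estimate[("x1", "a2")])   # still 0.5: a2 is never tried in context x1,
                                # so the true reward of 1.0 is never discovered
```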
  • 84. Framework for simultaneous exploration and exploitation 18
  • 85. Framework for simultaneous exploration and exploitation Initialize distribution W over some policies. For t = 1, 2, . . . , T: 18
  • 86. Framework for simultaneous exploration and exploitation Initialize distribution W over some policies. For t = 1, 2, . . . , T: 1. Observe context_t. 18
  • 87. Framework for simultaneous exploration and exploitation Initialize distribution W over some policies. For t = 1, 2, . . . , T: 1. Observe context_t. 2. Randomly pick policy ˆπ ∼ W, choose action_t = ˆπ(context_t) ∈ Actions. 18
  • 88. Framework for simultaneous exploration and exploitation Initialize distribution W over some policies. For t = 1, 2, . . . , T: 1. Observe context_t. 2. Randomly pick policy ˆπ ∼ W, choose action_t = ˆπ(context_t) ∈ Actions. 3. Collect reward_t(action_t) ∈ [0, 1], update distribution W. 18
  • 89. Framework for simultaneous exploration and exploitation Initialize distribution W over some policies. For t = 1, 2, . . . , T: 1. Observe context_t. 2. Randomly pick policy ˆπ ∼ W, choose action_t = ˆπ(context_t) ∈ Actions. 3. Collect reward_t(action_t) ∈ [0, 1], update distribution W. Can use “inverse propensity” importance weights to account for sampling bias when estimating expected rewards of policies. 18
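A sketch of that inverse-propensity estimate: given logged tuples (context, action, reward, p), where p is the probability with which the logged action was chosen under the randomization over policies, the expected reward of any policy can be estimated without having deployed it. The function names are placeholders.

```python
def ips_policy_value(policy, logged):
    """Inverse-propensity estimate of a policy's average reward (sketch).

    logged : (context, action, reward, p) tuples from the randomized loop
             above, with p = probability that `action` was chosen.
    Rounds where `policy` would have picked a different action contribute 0;
    matching rounds are up-weighted by 1/p, making the estimate unbiased.
    """
    n = len(logged)
    return sum(r / p for (x, a, r, p) in logged if policy(x) == a) / n
```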
  • 90. Algorithms for contextual bandit learning How to choose the distribution W over policies? 19
  • 91. Algorithms for contextual bandit learning How to choose the distribution W over policies? Exp4 (Auer, Cesa-Bianchi, Freund, and Schapire, FOCS 1995) Maintains exponential weights over all policies in a policy class Π. 19
  • 92. Algorithms for contextual bandit learning How to choose the distribution W over policies? Exp4 (Auer, Cesa-Bianchi, Freund, and Schapire, FOCS 1995) Maintains exponential weights over all policies in a policy class Π. Monster (Dudik, Hsu, Kale, Karampatziakis, Langford, Reyzin, and Zhang, UAI 2011) Solves optimization problem to come up with poly(T, log |Π|)-sparse weights over policies. 19
  • 93. Algorithms for contextual bandit learning How to choose the distribution W over policies? Exp4 (Auer, Cesa-Bianchi, Freund, and Schapire, FOCS 1995) Maintains exponential weights over all policies in a policy class Π. Monster (Dudik, Hsu, Kale, Karampatziakis, Langford, Reyzin, and Zhang, UAI 2011) Solves optimization problem to come up with poly(T, log |Π|)-sparse weights over policies. Mini-monster (Agarwal, Hsu, Kale, Langford, Li, and Schapire, ICML 2014) Simpler and faster algorithm to solve optimization problem; get much sparser weights over policies. 19
  • 94. Algorithms for contextual bandit learning How to choose the distribution W over policies? Exp4 (Auer, Cesa-Bianchi, Freund, and Schapire, FOCS 1995) Maintains exponential weights over all policies in a policy class Π. Monster (Dudik, Hsu, Kale, Karampatziakis, Langford, Reyzin, and Zhang, UAI 2011) Solves optimization problem to come up with poly(T, log |Π|)-sparse weights over policies. Mini-monster (Agarwal, Hsu, Kale, Langford, Li, and Schapire, ICML 2014) Simpler and faster algorithm to solve optimization problem; get much sparser weights over policies. Latter two methods rely on a good (non-interactive) learning algorithm for policy class Π. 19
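For flavor, here is a much-simplified exponential-weights sketch in the spirit of Exp4 over a small finite policy class; the uniform-exploration mixing, the learning rate, and the update are illustrative assumptions and do not reproduce the published algorithms or their guarantees.

```python
import math
import random

def exp4_sketch(policies, actions, rounds, eta=0.1, gamma=0.05):
    """Simplified Exp4-style contextual bandit learner (sketch only).

    policies : list of functions context -> action
    rounds   : iterable of (context, reward_fn) pairs, where reward_fn(a)
               returns the reward of the chosen action only
    """
    w = [1.0] * len(policies)                     # weights over policies
    for context, reward_fn in rounds:
        total = sum(w)
        # Induced action distribution: policy weights, mixed with uniform exploration.
        probs = {a: gamma / len(actions) for a in actions}
        for wi, pi in zip(w, policies):
            probs[pi(context)] += (1 - gamma) * wi / total
        action = random.choices(list(probs), weights=list(probs.values()))[0]
        reward = reward_fn(action)
        r_hat = reward / probs[action]            # inverse-propensity reward estimate
        for i, pi in enumerate(policies):         # reward the policies that agreed
            if pi(context) == action:
                w[i] *= math.exp(eta * r_hat)
        m = max(w)                                # rescale to avoid numeric overflow
        w = [wi / m for wi in w]
    return w
```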
  • 96. Wrap-up Interactive machine learning models capture how machine learning is used in many applications better than non-interactive frameworks do. 20
  • 97. Wrap-up Interactive machine learning models capture how machine learning is used in many applications better than non-interactive frameworks do. Sampling bias is a pervasive issue. 20
  • 98. Wrap-up Interactive machine learning models capture how machine learning is used in many applications better than non-interactive frameworks do. Sampling bias is a pervasive issue. Research: how to use non-interactive machine learning technology in these new interactive settings. 20
  • 99. Wrap-up Interactive machine learning models capture how machine learning is used in many applications better than non-interactive frameworks do. Sampling bias is a pervasive issue. Research: how to use non-interactive machine learning technology in these new interactive settings. Thanks! 20