FUN WITH BANDIT PROBLEMS?
By Shweta Gupte
Psy606: Human Problem Solving
Spring 2013
Introduction
Types of Bandit Problems
Applications
The Paper
2/8/2016 · Psy606: Human Problem Solving, Purdue University
INTRODUCTION
Basic Form:
People choose repeatedly between a small number of alternatives, each of which has an unknown rate of providing reward.
History:
Robbins (1952) constructed convergent population-selection strategies for sequential decision making.
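The basic form can be sketched as a tiny simulation (an illustration in Python; the arm probabilities here are hypothetical, and the decision maker never sees them):

```python
import random

class BernoulliBandit:
    """A small set of arms, each paying 1 with an unknown, fixed probability."""
    def __init__(self, probs, seed=0):
        self.probs = probs              # hidden reward rates, one per arm
        self.rng = random.Random(seed)

    def pull(self, arm):
        # Reward is 1 with the arm's hidden probability, else 0.
        return 1 if self.rng.random() < self.probs[arm] else 0

# The chooser observes only the 0/1 rewards, never the probabilities.
bandit = BernoulliBandit([0.3, 0.7])
rewards = [bandit.pull(1) for _ in range(1000)]
print(sum(rewards) / len(rewards))  # hovers near the hidden rate 0.7
```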
[Illustration slides: slot-machine arms labeled "Bad" and "Good"; "2 more days to go …"]
TYPES OF BANDIT PROBLEMS
Stationary: Fixed Horizon | Infinite Horizon
Dynamic (Restless): Fixed Horizon | Infinite Horizon
TYPES OF BANDIT PROBLEMS
One-armed
refers to a choice between an option with a known payout and an option with an unknown payout.
TYPES OF BANDIT PROBLEMS
Multi-armed
refers to the situation where there are multiple unknown alternatives.
APPLICATIONS
 Managing research projects
 Stock market investing
 Sports coaches tracking changes in team performance
 Drivers choosing among a number of routes
EXPLORATION AND EXPLOITATION
 Exploration -
choices made to gain information about the unknown arms.
 Exploitation -
a focus on a single arm, in order to obtain rewards from an option believed to be sufficiently good compared with the other competing options.
Expected behavior:
Exploration → Exploitation.
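A simple (suboptimal but widely used) way to balance the two modes, offered here as an illustration rather than anything from the paper, is an ε-greedy rule: explore a random arm with small probability ε, otherwise exploit the arm with the best observed mean:

```python
import random

def epsilon_greedy(pull, n_arms, n_trials, eps=0.1, seed=0):
    """Explore a random arm with probability eps; otherwise exploit
    the arm with the highest observed mean reward so far."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    total = 0.0
    for _ in range(n_trials):
        if rng.random() < eps or 0 in counts:
            arm = rng.randrange(n_arms)           # explore
        else:
            means = [s / c for s, c in zip(sums, counts)]
            arm = means.index(max(means))         # exploit
        r = pull(arm)
        counts[arm] += 1
        sums[arm] += r
        total += r
    return total, counts

# Hypothetical two-armed bandit with hidden rates 0.2 and 0.8.
env = random.Random(42)
pull = lambda a: 1 if env.random() < (0.2, 0.8)[a] else 0
total, counts = epsilon_greedy(pull, 2, 1000, eps=0.1, seed=7)
# Most pulls should end up on the better arm.
```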
SHORT SUMMARY
 Stationary Bandit Problem -
The reward rate for each alternative is constant over all of the trials.
The number of trials in each game may be known, creating a finite-horizon problem, or unknown, creating an infinite-horizon problem.
 Optimal solutions can be found for all cases in finite-horizon environments by using a dynamic-programming approach, where optimal decisions are computed for all potential cases starting from the final trial and solving for each trial back toward the first (Kaelbling et al., 1996).
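That backward-induction idea can be sketched for a toy case (my illustration, not the paper's code): a two-armed Bernoulli bandit with uniform Beta(1,1) priors on each arm's rate, where the state is just each arm's success/failure counts plus the trials remaining:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def value(s1, f1, s2, f2, t):
    """Expected future reward with t trials left, given per-arm
    success/failure counts and uniform Beta(1,1) priors.
    Computed by backward induction from the final trial."""
    if t == 0:
        return 0.0
    best = 0.0
    for arm, (s, f) in enumerate([(s1, f1), (s2, f2)]):
        p = (s + 1) / (s + f + 2)  # posterior mean of the arm's rate
        if arm == 0:
            win = value(s1 + 1, f1, s2, f2, t - 1)
            lose = value(s1, f1 + 1, s2, f2, t - 1)
        else:
            win = value(s1, f1, s2 + 1, f2, t - 1)
            lose = value(s1, f1, s2, f2 + 1, t - 1)
        best = max(best, p * (1 + win) + (1 - p) * lose)
    return best

# With 10 trials and no data, the adaptive optimum beats the
# 10 * 0.5 = 5.0 expected from committing blindly to one arm.
print(value(0, 0, 0, 0, 10))
```

The cache makes the exponential blow-up visible: the state space grows rapidly with the horizon and the number of arms, which is exactly why longer games become intractable.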
SHORT SUMMARY
 As the length of a game increases or the number of
alternatives increases, the computation necessary
to create a complete decision tree increases
exponentially.
 Restless Bandit Problems -
 The reward rates for alternatives may change over time, rather than remaining stationary through each trial of the game.
 Changes must be detected, forcing repeated switching between exploration and exploitation.
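One simple restless environment (illustrative dynamics only, not necessarily the paper's generative model) lets each arm's rate drift as a bounded random walk between trials:

```python
import random

class RestlessBandit:
    """Arms whose reward rates drift as a bounded random walk over trials."""
    def __init__(self, rates, step=0.05, seed=0):
        self.rates = list(rates)        # current hidden reward rates
        self.step = step                # maximum drift per trial
        self.rng = random.Random(seed)

    def pull(self, arm):
        r = 1 if self.rng.random() < self.rates[arm] else 0
        # After every pull, each rate takes a small random step, clipped to [0, 1].
        self.rates = [min(1.0, max(0.0, p + self.rng.uniform(-self.step, self.step)))
                      for p in self.rates]
        return r

bandit = RestlessBandit([0.2, 0.8])
history = [bandit.pull(0) for _ in range(200)]
# The arm that was worst at the start may become best later,
# which is why a decision maker must keep re-exploring.
```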
OPTIMAL SOLUTIONS VERSUS HEURISTICS
 Optimal solutions tend to be fairly ponderous in terms of computational cost and can often be applied only in limited situations.
 Heuristics are geared toward obtaining performance that, while not optimal, is still good, with comparatively much less work.
 Of course, there are also models that fall between the two extremes in complexity; the particle filter model used in the paper cannot really be counted in either of the two groups.
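A classic example of a cheap bandit heuristic is win-stay, lose-shift (shown here as an illustration with hypothetical arm rates, not as a strategy evaluated in the paper): repeat the last arm after a reward, switch after a non-reward:

```python
import random

def win_stay_lose_shift(pull, n_arms, n_trials, seed=0):
    """Win-stay, lose-shift heuristic: repeat the last arm after a
    reward; switch to a random other arm after a non-reward."""
    rng = random.Random(seed)
    arm = rng.randrange(n_arms)
    total = 0
    for _ in range(n_trials):
        r = pull(arm)
        total += r
        if r == 0:  # lose -> shift to a different arm
            arm = rng.choice([a for a in range(n_arms) if a != arm])
    return total

# Hypothetical two-armed bandit with hidden rates 0.2 and 0.8.
env = random.Random(1)
pull = lambda a: 1 if env.random() < (0.2, 0.8)[a] else 0
total = win_stay_lose_shift(pull, 2, 1000)
# Cheap to run and much better than chance, though not optimal.
```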
GITTINS INDEX?
A Gittins index gives each alternative a utility that takes into account the alternative's current estimated value and the information that can be gained from choosing it; the optimal decision is the arm with the largest index value.
Gittins indices are applicable only to a limited number of bandit problems, and can be difficult to compute even in those cases (Berry & Fristedt, 1985).
GITTINS INDEX?
 The Gittins index is a measure of the reward that can be achieved by a process evolving from its present state onward, given a probability that it will be terminated in the future.
 It is a real scalar value associated with the state of a stochastic process that has a reward function and a probability of termination.
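That definition suggests the textbook "calibration" way to approximate the index for a single Bernoulli arm (a sketch under stated assumptions: Beta(1,1) prior, discount factor 0.9, horizon truncated at depth 25; this is not code from the paper): bisect on the known standing payout λ at which continuing the unknown arm and retiring to λ forever exactly tie.

```python
from functools import lru_cache

GAMMA = 0.9   # discount factor (assumed)
DEPTH = 25    # truncation depth approximating the infinite horizon

def gittins_index(successes, failures, iters=30):
    """Approximate Gittins index of a Bernoulli arm under a Beta(1,1)
    prior: the standing payout lam at which continuing and retiring tie."""
    def continue_beats_retire(lam):
        @lru_cache(maxsize=None)
        def v(s, f, t):
            if t == 0:
                return 0.0
            retire = lam * (1 - GAMMA ** t) / (1 - GAMMA)  # lam per step, discounted
            p = (s + 1) / (s + f + 2)                      # posterior mean
            cont = (p * (1 + GAMMA * v(s + 1, f, t - 1))
                    + (1 - p) * GAMMA * v(s, f + 1, t - 1))
            return max(retire, cont)
        s, f, t = successes, failures, DEPTH
        retire = lam * (1 - GAMMA ** t) / (1 - GAMMA)
        p = (s + 1) / (s + f + 2)
        cont = (p * (1 + GAMMA * v(s + 1, f, t - 1))
                + (1 - p) * GAMMA * v(s, f + 1, t - 1))
        return cont >= retire
    lo, hi = 0.0, 1.0
    for _ in range(iters):                 # bisection on lam
        mid = (lo + hi) / 2
        if continue_beats_retire(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# For a fresh arm the index exceeds the posterior mean of 0.5:
# the surplus is the value of the information gained by trying it.
print(gittins_index(0, 0))
```

Even this toy version needs a nested dynamic program inside a bisection loop, which is the point the slide makes: the index is conceptually clean but laborious to compute.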
TSP AND BANDIT PROBLEMS: THE SAME?
 Bandit problems are highly sequential: information gained on each trial can be used to inform decisions on subsequent trials.
 TSPs are spatial tasks where, generally, all information is available at the outset of the task. The connections made between nodes at each step are sequential only in the sense that they are not made simultaneously.
AUTHORS’ MOTIVATION?
When optimal solutions are available, bandit problems
provide an opportunity to examine whether or how people
make the best possible decisions.
For this reason, many previous empirical studies have been motivated by economic theories, with a focus on deviations from rationality in human decision-making (e.g., Banks, Olson, & Porter, 1997; Meyer & Shi, 1995).
 More recently, human performance on the bandit problem has been studied within cognitive neuroscience (e.g., Cohen, McClure, & Yu, 2007; Daw, O’Doherty, Dayan, Seymour, & Dolan, 2006) and probabilistic models of human cognition (e.g., Steyvers, Lee, & Wagenmakers, 2009).
PARTICLE FILTERS
 http://www.youtube.com/watch?v=O-lAJVra1PU
Particle Filter vs. MCMC:
 Particle filter: depending on the design, may need less computation time. A sophisticated model-estimation technique based on simulation; particle filters are usually used to estimate Bayesian models in which the latent variables are connected in a Markov chain. They estimate the distribution of only one of the latent variables at a time, rather than attempting to estimate them all at once, and produce a set of weighted samples, rather than a (usually much larger) set of unweighted samples.
 MCMC: more computation time as information increases. A class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution.
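The propagate/reweight/resample cycle can be sketched for a single arm whose reward rate drifts over time (a minimal bootstrap particle filter, with random-walk dynamics assumed for illustration; not the paper's exact model):

```python
import random

def particle_filter(observations, n_particles=500, step=0.05, seed=0):
    """Bootstrap particle filter for a Bernoulli reward rate drifting as a
    bounded random walk. Returns the posterior-mean estimate of the rate
    after each 0/1 observation."""
    rng = random.Random(seed)
    particles = [rng.random() for _ in range(n_particles)]  # uniform prior on [0, 1]
    estimates = []
    for r in observations:
        # 1. Propagate each particle through the assumed drift dynamics.
        particles = [min(1.0, max(0.0, p + rng.uniform(-step, step)))
                     for p in particles]
        # 2. Reweight by the Bernoulli likelihood of the observed reward.
        weights = [p if r == 1 else 1 - p for p in particles]
        total = sum(weights)
        estimates.append(sum(p * w for p, w in zip(particles, weights)) / total)
        # 3. Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

# Feed a run of mostly-1 rewards: the estimated rate climbs toward 1.
est = particle_filter([1, 1, 1, 0, 1, 1, 1, 1, 0, 1] * 5)
```

Because each observation touches only the current set of particles, the per-trial cost stays flat as data accumulate, in contrast to rerunning an MCMC chain over the whole history.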
THE PAPER
Modeling Human Performance in Restless Bandits with Particle Filters
(Fall 2009)
EXPERIMENT 1
 The restless bandit problem is an extension of sequential stationary infinite-horizon problems.
 The behavior of human participants in a restless bandit environment is observed and compared to two different particle filter solution methods:
one optimal, the other suboptimal.
 27 participants from UCI, receiving course credit.
EXPERIMENT 1
INTERFACE
RESULTS
OVERALL CONCLUSIONS
Many potential applications:
 Clinical trials
 Advertising: which ad to put on a web page?
 Labor markets: which job should a worker choose?
 Optimization of noisy functions
 Numerical resource allocation
OVERALL CONCLUSIONS
How to solve:
 Markov chain Monte Carlo, particle filters
 Use the Gittins index
Paper:
 focuses on human performance rather than optimal solutions; does not use the Gittins index
ACKNOWLEDGEMENTS
Sheng Kung M. Yi, Mark Steyvers, and Michael Lee
REFERENCES
 Robbins, H. (1952). Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58(5), 527–535.
 Berry, D. A., & Fristedt, B. (1985). Bandit problems: Sequential allocation of experiments. Monographs on Statistics and Applied Probability. London: Chapman & Hall. ISBN 0-412-24810-7.
 Gittins, J. C. (1989). Multi-armed bandit allocation indices. Wiley-Interscience Series in Systems and Optimization. Chichester: John Wiley & Sons. ISBN 0-471-92059-2.
 Doucet, A., De Freitas, N., & Gordon, N. J. (2001). Sequential Monte Carlo methods in practice. Springer.
QUESTIONS?
That’s All Folks!
How do we make money?
If we understand this model well, Vegas is waiting!