https://www.insight-centre.org/content/multi-armed-bandit-approach-online-spatial-task-assignment
Presented at UIC 2014
Abstract
Spatial crowdsourcing uses workers for performing tasks that require travel to different locations in the physical world. This paper considers the online spatial task assignment problem. In this problem, spatial tasks arrive in an online manner and an appropriate worker must be assigned to each task. However, outcome of an assignment is stochastic since the worker can choose to accept or reject the task. Primary goal of the assignment algorithm is to maximize the number of successful assignments over all tasks. This presents an exploration-exploitation challenge; the algorithm must learn the task acceptance behavior of workers while selecting the best worker based on the previous learning. We address this challenge by defining a framework for online spatial task assignment based on the multi-armed bandit formalization of the problem. Furthermore, we adapt a contextual bandit algorithm to assign a worker based on the spatial features
of tasks and workers. The algorithm simultaneously adapts
the worker assignment strategy based on the observed task
acceptance behavior of workers. Finally, we present an evaluation
methodology based on a real world dataset, and evaluate the
performance of the proposed algorithm against the baseline
algorithms. The results demonstrate that the proposed algorithm
performs better in terms of the number of successful assignments.
A Multi-armed Bandit Approach to Online Spatial Task Assignment
1. Umair ul Hassan, Edward Curry
A Multi-armed Bandit Approach to
Online Spatial Task Assignment
11th IEEE International Conference on Ubiquitous Intelligence and Computing
December 9-12, 2014
Ayodya Resort, Bali, Indonesia
6. Assumptions
A1) Workers do not actively visit the platform to seek tasks.
A2) Tasks arrivals are dynamic: task arrival time is unknown
A3) The outcome of an assignment is stochastic: worker may
accept or reject an assigned task
4 July 20166
Please provide recent photos of this place.
7. IMIRT Framework
Intelligent Models for Iterative Routing of Tasks (A1)
4 July 20167
IMIRT Framework
Router
ProfilerWorker
Models
Interface
assignment
update
outcomeexpectation
wj ti
wj-1
ti+1
wj+1
ti+2
Find The Best
Assignment
8. Offline Task Assignment
4 July 20168
Assignment
indicator
Acceptance
inidcator
Optimization
Constraints
Optimization
Objective
Number of
tasks
Number of
workers
9. Online Task Assignment
Either tasks or workers arrive dynamically (A2)
Existing research
4 July 20169
10. Online Task Assignment
Assignment under uncertainty (A3)
Workers may accept or reject an assigned task
Exploration-Exploitation Trade-off
4 July 201610
Select worker to learn
about acceptance
behaviour Select
worker that has highest
expectation of successful outcome
11. Multi-armed Bandit
Which arm should be played to maximize reward?
4 July 201611
antigavin@flickr
“Bandit problems embody in
essential form a conflict evident in all
human action: choosing actions
which yield immediate reward vs.
choosing actions (e.g. acquiring
information or preparing the ground)
whose benefit will come only later.”
— P. Whittle (1980)
12. Multi-armed Bandit
Application of multi-armed bandit model
4 July 201612
Clinical
Trials
Ad
Placement
Adaptive
Routing
Stock
Investment
13. Multi-armed Bandit
Assignment under uncertainty (R3)
Workers may accept or reject an assigned task
4 July 201613
Exploration
Exploitation
Trade-off
antigavin@flickr
What if we knew that reward of
each machine is dependent on
the time of day
(Contextual Bandit)
14. SpatialUCB
Multi-armed bandit approach to task assignment
Optimistic assignment under uncertainty
Exploits relationship between spatial context and task
acceptance
4 July 201614
Start
Wait for new task
Observe spatial
features of task
and all workers
Calculate expectation of
success for all workers
Assign worker
with highest
expectation
Observe
outcome for the
assignment
Use ridge regression to update model
coefficients for assigned worker
15. Gowalla Dataset
Location based Social Network
User = Worker
Check-in Spot = Spatial Task
Highlight Spot = Non Spatial Task
Characteristics of selected 90 workers
4 July 201615
Users 9,183
Spots 30,367
Highlights 2,767
Check-ins 357,753
16. Experiment 1
Simulate a worker as Binomial stochastic process
Simulate 90 workers and 5,000 tasks
Probability of success based on Gowalla dataset
ASR (Assignment Success Rate)
Context free algorithms
4 July 201616
17. Experiment 2
Assignment with spatial context
ASR (Assignment Success Rate)
ATD (Average Travel Distance)
4 July 201617
5k tuning tasks 26k testing tasks
18. Conclusion
Multi-armed bandit is an appropriate tool for modelling the
task assignment problem in spatial crowdsourcing
SpatialUCB performs better for learning task acceptance
behaviour based on the spatial contextual information
We plan to extend SpatialUCB to combinatorial
assignments
4 July 201618
19. Thank You
Umair ul Hassan
umair.ulhassan@insight-centre.org
www.insight-centre.org
Notas del editor
Introduce myself
My institute
My university
My paper
Summarize the agenda (1-2mins)
Task are associated with locations
Worker are also associate with locations
A new task requires worker to travel to that particular location to perform that task
Some existing platform use monetary rewards for SC
A success story was volunteer based SC during the Haiti earthquake
This overall process is common among all forms of crowdsourcing.
In SC physical aspect are more important
Pull method is mostly popular with the existing platform but it can have Search Friction and Starvation issues
The assumptions are based on the push method of task assignment with online setting for task arrivals
Description is a textual attribute that lists the instructions to be followed for correctly performing the task.
Location attribute defines the coordinates of the location associate with the task.
Expiry attribute is a time-stamp defining the deadline for task completion, after which the task becomes invalid.
Type attribute indicates the type of task.
History attribute is a vector that stores the number of tasks assigned to the worker and the number of tasks accepted by the worker.
Model is a set of the vectors that stores the variables specific to the task acceptance behavior of the worker.
Location attribute stores the last reported location of the worker.
The offline problem can be modelled through dynamic programming
We can solve the problem with branch-bound methods or cutting planes methods
Existing research has worker on the online task assignment problem
But either it is focused on virtual tasks or the objective is coverage in terms of assignment (not the assignment success)
Our framework assumes dynamic task arrival and stochastic acceptance behavior
The primary source of uncertainty is the task acceptance
It posses a fundamental trade-off
We aim to address this trade-off
antigavin@flickr
The multi-armed bandit problem models an agent that simultaneously attempts to acquire new knowledge and optimize his or her decisions based on existing knowledge.
It assume the exploration-exploitation trade-off
Similar trade-offs is other domains
sanofi-pasteur@flickr (clinical trials investigating the effects of different experimental treatments while minimizing patient losses)
Wikipedia (adaptive routing efforts for minimizing delays in a network)
eelssej_@flickr (ad placement in web based advertisements to maximize revenue and web search)
yahoo_presse@flickr (which stock option gives the highest return, under time-varying return profiles)
MAB has many variants
One of them uses expert advice or side information (also called context)
MAB assumes uncertain rewards but immediate observability
Individualized linear mode for each worker
Ridge regression can also be interpreted as a Bayesian point estimate, where the posterior distribution of the coefficient vector, denoted as p(beta), is Gaussian
with mean <beta> and covariance <A> . Given the current model, the predictive variance of the expected payoff x,beta is evaluated as first term and the second term becomes the standard deviation
The criterion for arm selection in Eq. (5) can also be regarded as an additive trade-off between the payoff estimate and model uncertainty reduction.
Leyla Kazemi and Cyrus Shahabi. Geocrowd: enabling query answering with spatial crowdsourcing. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, pages 189–198. ACM, 2012
Hien To, Gabriel Ghinita, and Cyrus Shahabi. A framework for protecting worker location privacy in spatial crowdsourcing. Proceedings of the VLDB Endowment, 7(10), 2014.
Discuss the features on the dataset from paper
Figure 4a shows the distribution of the number of check-ins by each user on a logarithmic scale.
The distribution shows the Zipf’s law behavior for the number of unique check-ins by a user; the majority of users have very low activity.
This behavior is commonly observed in various physical and social phenomena.
We excluded the long tail of low activity users by selecting top ranked users based on their check-ins and highlights. T
he resulting dataset had 90 users with relatively high levels of activity.
The distribution of check-ins, for selected users, is shown in Figure 4b.
Clearly, the check-in behavior varies across users.
Some users have higher number of check-ins with in 5-10 kilometers, while other users have visited spots as far as 25 kilometers away.
Conversely, there are users who visit very small number of spot irrespective of the distance.
The average distance shows a negative correlation with the number of check-ins.
Overall this behavior is representative of worker dynamics in spatial crowdsourcing; more tasks tend to be completed in the near vicinity of workers
We compare standard MAB algorithms to narrow down best parameter values
We used 5k tasks to further fine tune the algorithms
18-25% improvement against best non-contextual algorithm (in terms of task acceptance)
Add some exact figures or numbers here
Generalized Linear Models (Linear, Probit, Logit)
We were looking to evaluate the effect of adding spatial context
Paid research internships