Combining Motion Planning and Optimization for Flexible Robot Manipulation
1. Combining Motion Planning and Optimization
for Flexible Robot Manipulation
Jonathan Scholz and Mike Stilman
IEEE-RAS International Conference on Humanoid Robots, 2010
COMP 790-099, Presenter: Ravikiran Janardhana
2. Problem Statement
• Design a system/algorithm to solve general
manipulation tasks in natural human
environments
• Involves uncertain dynamics and
underspecified goals
• Service manipulation tasks range from house
cleaning to collaborative factory automation
3. Service Robots
• Challenges – Unfamiliar Objects and Abstract
Goals
• Learn about objects in addition to planning
interactions
• Accept a broad variety of goals,
e.g., setting a table
4. Related Work
• Probabilistic Roadmaps, Rapidly Exploring
Random Trees
• Model-free Reinforcement Learning
• Model-based learners i.e., learning from
demonstration
5. Proposed Solution
• Task-space probabilistic planner
• Combine the strengths of model-based planning
and reinforcement learning, i.e., model-based
planning with optimization
• Reaching an optimal world configuration is
more important than finding the optimal way
to reach it
6. Flexible Manipulation
• Determining the goal or the optimal
configuration
• Finding the forward models for robot actions
• Planning to use the actions to reach the goal
7. Service Task: Setting a Table
• Consider a dinner where n guests must be
given n plates and m platters must be placed
at the center of the table
8. Objective Function Specification
• The user can specify the goal as an abstract
optimization metric
• The objectives are:
– The plates should be located far from each other
– The platters should be at the center of the table
– The platters should be aligned parallel to the table
9. Objective Function Specification
• Define two sets of objects: plates P and
platters Q
• Each object location is parameterized by
position and orientation {x, y, θ}
• Environmental constraints – table dimensions:
x_min ≤ x ≤ x_max, y_min ≤ y ≤ y_max
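The parameterization above can be sketched in a few lines of Python. This is only an illustrative sketch: the class names `Pose` and `TableBounds`, the use of metres and radians, and the example table size are assumptions, not taken from the paper.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Pose:
    """Planar object pose {x, y, theta}; theta in radians (assumed unit)."""
    x: float
    y: float
    theta: float


@dataclass(frozen=True)
class TableBounds:
    """Axis-aligned table extent used as the environmental constraint."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, pose: Pose) -> bool:
        # x_min <= x <= x_max and y_min <= y <= y_max
        return (self.x_min <= pose.x <= self.x_max
                and self.y_min <= pose.y <= self.y_max)


# Hypothetical 1.2 m x 0.8 m table with two plates (set P) and one platter (set Q).
table = TableBounds(0.0, 1.2, 0.0, 0.8)
plates = [Pose(0.2, 0.2, 0.0), Pose(1.0, 0.6, 0.0)]
platters = [Pose(0.6, 0.4, math.pi / 2)]

all_on_table = all(table.contains(p) for p in plates + platters)
```

Keeping the constraint check separate from the objective makes it easy to reject infeasible configurations before scoring them.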
10. Objective Function - Math
• Maximize Plate distance
• Put Platters at Table Center
• Align Platters with Table
11. Objective Function - Math
• Overall objective function:
• The weights α, β, γ must be specified with
regard to the relative importance of the
subtasks.
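One way to picture a weighted objective of this kind is the sketch below. The per-term formulas are not reproduced in the text above, so every term here is an assumption: pairwise Euclidean distance for plate spread, squared distance to a table center for the platters, and sin²θ as the alignment penalty. The function names and the choice of a cost to be minimized are likewise hypothetical; only the three subtasks and the weights α, β, γ come from the slides.

```python
import math
from itertools import combinations


def plate_spread(plates):
    """Sum of pairwise distances between plates (larger is better). Assumed form."""
    return sum(math.hypot(a[0] - b[0], a[1] - b[1])
               for a, b in combinations(plates, 2))


def platter_center_cost(platters, center):
    """Sum of squared distances from each platter to the table center. Assumed form."""
    cx, cy = center
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y, _ in platters)


def platter_alignment_cost(platters):
    """Zero when theta is 0 or pi, i.e. parallel to the table's x axis. Assumed form."""
    return sum(math.sin(theta) ** 2 for _, _, theta in platters)


def objective(plates, platters, center, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted cost to minimize; alpha, beta, gamma weight the three subtasks."""
    return (-alpha * plate_spread(plates)
            + beta * platter_center_cost(platters, center)
            + gamma * platter_alignment_cost(platters))


center = (0.6, 0.4)  # hypothetical table center
tidy = objective([(0.1, 0.1, 0.0), (1.1, 0.7, 0.0)], [(0.6, 0.4, 0.0)], center)
messy = objective([(0.5, 0.4, 0.0), (0.7, 0.4, 0.0)],
                  [(0.2, 0.2, math.pi / 4)], center)
```

With these assumed terms, the configuration with well-separated plates and a centered, aligned platter scores a lower cost than the cluttered one; changing α, β, γ shifts which subtask dominates.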
12. Action Model Learning
• Given state space S and actions A, the probability
of the outcome of any action in any state is P(s' | s, a)
• Probability distribution obtained by
exploration.
• Compute probability models of displacement,
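As a rough illustration of learning an action model by exploration, the sketch below fits an independent Gaussian to the observed x and y displacement of each action. The Gaussian form, the function names, the action label `push_right`, and the simulated 5 cm push with 1 cm noise are all assumptions made for the example; the slides do not specify how the displacement models are represented.

```python
import random
from collections import defaultdict
from statistics import mean, pvariance


def fit_displacement_models(samples):
    """Fit per-action mean and variance of (dx, dy) from exploration samples.

    samples: iterable of (action, dx, dy) tuples.
    """
    by_action = defaultdict(list)
    for action, dx, dy in samples:
        by_action[action].append((dx, dy))

    models = {}
    for action, displacements in by_action.items():
        dxs = [d[0] for d in displacements]
        dys = [d[1] for d in displacements]
        models[action] = {
            "mean": (mean(dxs), mean(dys)),
            "var": (pvariance(dxs), pvariance(dys)),
            "n": len(displacements),
        }
    return models


def sample_outcome(state, action, models, rng):
    """Draw a next state s' from the fitted model, approximating P(s' | s, a)."""
    (mx, my), (vx, vy) = models[action]["mean"], models[action]["var"]
    x, y = state
    return (x + rng.gauss(mx, vx ** 0.5), y + rng.gauss(my, vy ** 0.5))


# Simulated exploration: a hypothetical push that moves an object about 5 cm in x.
rng = random.Random(0)
samples = [("push_right", rng.gauss(0.05, 0.01), rng.gauss(0.0, 0.01))
           for _ in range(500)]
models = fit_displacement_models(samples)
next_state = sample_outcome((0.3, 0.4), "push_right", models, rng)
```

A planner can then call `sample_outcome` (or use the fitted mean directly) as its forward model instead of a hand-written physics model.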
22. Conclusion / Future Work
• The paper presents a general framework for
handling abstract tasks in object manipulation
using reinforcement learning and model-based
planning
• Future work: explore broader tools and domains
that increase the generality of task-space planning
by combining planning, learning and
optimization
23. Comments
• Requires tuning of parameters such as σ²_ref and ε,
which are highly task dependent
• Models can be stored for future use
• Collision detection would become complex if the
problem size increased; the RRT might then
deadlock, reducing the algorithm to random
search
Briefly explain the meaning of uncertain dynamics and underspecified goals.
Stress the need to learn about objects in addition to planning. Give more examples of the broad variety of goals.
Explain the caveats of PRMs and RRTs. Explain reinforcement learning and discuss the limitations of the related work.
Set up the next slide so that listeners understand what task space is. Explain what is meant by the strengths of reinforcement learning and model-based planning. Define optimal world configuration.
Briefly explain forward models.
Explain the table-setting problem.
Explain the equations, (x_g, y_g), and the last equation in detail.
Touch on Markov decision processes and on how the probability distribution and the probability models of displacement are computed.
Explain the bounding box and the 5 cm displacement vector.
It is effectively an RRT with two modifications: (a) select the state-action pair that results in the node closest to a sample point; (b) a direct GD heuristic to reach the global minimum faster. Explain the significance of "epsilon" and the distance metric "rho".
SRLib block and cylinder primitives are used.
Talk about the general framework presented and how it can be used to solve a variety of manipulation tasks.
Talk about empirically determined parameters (determined by experiment or observation) and how models can be reused and shared.
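The RRT variant mentioned in the notes (choose the state-action pair whose outcome lands closest to a sample, plus a gradient-descent heuristic) might look roughly like the sketch below. It is a toy 2-D illustration under stated assumptions, not the paper's planner: `rho` is taken to be Euclidean distance, `epsilon` is assumed to be the probability of drawing the sample from a gradient-descent step on the objective instead of uniformly at random, and the deterministic forward model, action set, and all function names are hypothetical.

```python
import math
import random


def rho(a, b):
    """Assumed distance metric: Euclidean distance in a 2-D task space."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def extend(tree, sample, actions, forward_model):
    """Add the outcome of the (state, action) pair that lands closest to sample."""
    candidates = ((s, a, forward_model(s, a)) for s in tree for a in actions)
    state, action, outcome = min(candidates, key=lambda c: rho(c[2], sample))
    tree.append(outcome)
    return state, action, outcome


def numerical_gradient(cost, s, h=1e-4):
    """Central-difference gradient of the cost at state s."""
    x, y = s
    return ((cost((x + h, y)) - cost((x - h, y))) / (2 * h),
            (cost((x, y + h)) - cost((x, y - h))) / (2 * h))


def plan(start, cost, actions, forward_model, bounds,
         epsilon=0.3, step=0.1, iters=300, seed=0):
    """Grow a tree in task space and return its lowest-cost state."""
    rng = random.Random(seed)
    x_min, x_max, y_min, y_max = bounds
    tree = [start]
    for _ in range(iters):
        if rng.random() < epsilon:
            # Assumed heuristic: sample one gradient-descent step from the best node.
            best = min(tree, key=cost)
            gx, gy = numerical_gradient(cost, best)
            sample = (best[0] - step * gx, best[1] - step * gy)
        else:
            sample = (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))
        extend(tree, sample, actions, forward_model)
    return min(tree, key=cost)


goal = (0.9, 0.6)  # hypothetical optimum of a toy objective


def cost(s):
    return rho(s, goal) ** 2


def forward_model(s, a):
    # Deterministic stand-in for a learned displacement model.
    return (s[0] + a[0], s[1] + a[1])


actions = [(0.05, 0.0), (-0.05, 0.0), (0.0, 0.05), (0.0, -0.05)]
start = (0.1, 0.1)
final_state = plan(start, cost, actions, forward_model,
                   bounds=(0.0, 1.2, 0.0, 0.8))
```

In this sketch a larger `epsilon` makes the search greedier toward the objective's minimum, while a smaller value keeps more of the uniform exploration of a plain RRT.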