Machine Learning and Robotics


  1. MACHINE LEARNING AND ROBOTICS
       Lisa Lyons, 10/22/08
  2. OUTLINE
       • Machine Learning Basics and Terminology
       • An Example: DARPA Grand/Urban Challenge
       • Multi-Agent Systems
       • Netflix Challenge (if time permits)
  3. INTRODUCTION
       • Machine learning is commonly associated with robotics
       • When some people think of robots, they picture machines like WALL-E: human-looking, with feelings, and capable of complex tasks
       • Goals for machine learning in robotics aren’t usually this advanced, but some think we’re getting there
       • The next three slides outline some goals that motivate researchers to continue work in this area
  4. HOUSEHOLD ROBOT TO ASSIST HANDICAPPED
       • Could come preprogrammed with general procedures and behaviors
       • Needs to be able to learn to recognize objects and obstacles and maybe even its owner (face recognition?)
       • Also needs to be able to manipulate objects without breaking them
       • May not always have all information about its environment (poor lighting, obscured objects)
  5. FLEXIBLE MANUFACTURING ROBOT
       • Configurable robot that could manufacture multiple items
       • Must learn to manipulate new types of parts without damaging them
  6. LEARNING SPOKEN DIALOG SYSTEM FOR REPAIRS
       • Given some initial information about a system, a robot could converse with a human and help to repair it
       • Speech understanding is a very hard problem in itself
  7. MACHINE LEARNING BASICS AND TERMINOLOGY
       • With applications and examples in robotics
  8. LEARNING ASSOCIATIONS
       • Association Rule: the conditional probability P(Y|X) that an event Y will happen given that another event X already has
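As a rough illustration of the association rule above, P(Y|X) can be estimated from observed data as the fraction of observations containing X that also contain Y. The function name `confidence` and the room observations are hypothetical, chosen only for this sketch.

```python
def confidence(transactions, x, y):
    """Estimate P(Y|X) as count(X and Y) / count(X)."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        raise ValueError(f"no observation contains {x!r}")
    with_both = [t for t in with_x if y in t]
    return len(with_both) / len(with_x)

# Hypothetical data: each set lists the objects a robot saw in one room.
observations = [
    {"table", "chair"},
    {"table", "chair", "lamp"},
    {"table"},
    {"chair", "lamp"},
]

# Of the three rooms with a table, two also had a chair.
p_chair_given_table = confidence(observations, "table", "chair")
print(p_chair_given_table)
```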
  9. CLASSIFICATION
       • Classification: model where input is assigned to a class based on some data
       • Prediction: assuming a future scenario is similar to a past one, using past data to decide what this scenario would look like
       • Pattern Recognition: a method used to make predictions
           • Face Recognition
           • Speech Recognition
       • Knowledge Extraction: learning a rule from data
       • Outlier Detection: finding exceptions to the rules
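One of the simplest ways to assign an input to a class is a nearest-centroid classifier, sketched below. The slides do not name a specific classifier, so this is only one possible instance, and the two-feature "obstacle"/"free" data is made up for illustration.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(labeled):
    """labeled: dict mapping class name -> list of feature tuples."""
    return {label: centroid(pts) for label, pts in labeled.items()}

def classify(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda label: math.dist(model[label], x))

# Hypothetical training data: two sensor features per example.
training = {
    "obstacle": [(0.9, 0.8), (1.0, 0.7)],
    "free": [(0.1, 0.2), (0.0, 0.1)],
}
model = train(training)
label = classify(model, (0.8, 0.9))
print(label)
```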
  10. REGRESSION
       • Linear regression is an example
       • Both Classification and Regression are “Supervised Learning” strategies where the goal is to find a mapping from input to output
       • Example: Navigation of autonomous car
           • Training Data: actions of human drivers in various situations
           • Input: data from sensors (like GPS or video)
           • Output: angle to turn steering wheel
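The steering example can be sketched with one-variable least-squares linear regression. The single input feature (lateral offset from the lane center) and the training values are assumptions made for this sketch; a real system would use far richer sensor input.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data recorded from a human driver:
# lateral offset from lane center (m) -> steering angle chosen (degrees).
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
angles = [10.0, 5.0, 0.0, -5.0, -10.0]

slope, intercept = fit_line(offsets, angles)
predicted_angle = slope * 0.25 + intercept  # prediction for an unseen offset
print(slope, intercept, predicted_angle)
```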
  11. UNSUPERVISED LEARNING
       • Only have input
       • Want to find regularities in the input
       • Density Estimation: finding patterns in the input space
           • Clustering: find groupings in the input
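Clustering can be illustrated with a minimal one-dimensional k-means. The slides do not say which clustering method is meant, so k-means is an assumption here, and the range readings are invented.

```python
def kmeans_1d(values, centers, iterations=10):
    """Alternate between assigning values to the nearest center and re-averaging."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical unlabeled range readings (m): two natural groupings.
readings = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, groups = kmeans_1d(readings, centers=[0.0, 6.0])
print(centers, groups)
```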
  12. REINFORCEMENT LEARNING
       • Policy: generating correct actions to reach the goal
       • Learn from past good policies
       • Example: robot navigating unknown environment in search of a goal
           • Some data may be missing
           • May be multiple agents in the system
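A minimal sketch of learning a policy by reinforcement, using tabular Q-learning on a hypothetical five-cell corridor where the goal is the rightmost cell. The slides do not name an algorithm; Q-learning, the reward of 1 at the goal, and all parameter values are assumptions for illustration. The robot explores at random, and the greedy policy is read off the learned table afterwards.

```python
import random

def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    goal = n_states - 1
    actions = (-1, +1)  # move left or right
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # step cap so an episode always ends
            a = rng.randrange(2)  # purely random exploration
            nxt = min(max(state + actions[a], 0), goal)
            reward = 1.0 if nxt == goal else 0.0
            future = 0.0 if nxt == goal else max(q[nxt])
            # Move the estimate toward reward + discounted best future value.
            q[state][a] += alpha * (reward + gamma * future - q[state][a])
            state = nxt
            if state == goal:
                break
    return q

q_table = q_learning()
# Greedy policy for every non-goal state: the action with the larger Q-value.
policy = [(-1, +1)[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```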
  13. POSSIBLE APPLICATIONS
       • Exploring a world
       • Learning object properties
       • Learning to interact with the world and with objects
       • Optimizing actions
       • Recognizing states in world model
       • Monitoring actions to ensure correctness
       • Recognizing and repairing errors
       • Planning
       • Learning action rules
       • Deciding actions based on tasks
  14. WHAT WE EXPECT ROBOTS TO DO
       • Be able to react promptly and correctly to changes in environment or internal state
       • Work in situations where information about the environment is imperfect or incomplete
       • Learn through their experience and human guidance
       • Respond quickly to human interaction
       • Unfortunately, these are very high expectations that don’t always match what machine learning techniques can deliver
  15. DIFFERENCES BETWEEN OTHER TYPES OF MACHINE LEARNING AND ROBOTICS
       • Other ML Applications
           • Planning can frequently be done offline
           • Actions usually deterministic
           • No major time constraints
       • Robotics
           • Often requires simultaneous planning and execution (online)
           • Actions could be nondeterministic depending on data (or lack thereof)
           • Real-time operation often required
  16. AN EXAMPLE: DARPA GRAND/URBAN CHALLENGE
  17. THE CHALLENGE
       • Run by the Defense Advanced Research Projects Agency (DARPA)
       • Goal: to build a vehicle capable of traversing unrehearsed off-road terrain
       • Started in 2003
       • 142-mile course through the Mojave Desert (2004 race)
       • No one made it through more than 5% of the course in the 2004 race
       • In 2005, 195 teams registered, 23 teams raced, 5 teams finished
  18. THE RULES
       • Must traverse a desert course up to 175 miles long in under 10 hours
       • Course kept secret until 2 hours before the race
       • Must follow speed limits for specific areas of the course to protect infrastructure and ecology
       • If a faster vehicle needs to overtake a slower one, the slower one is paused so that vehicles don’t have to handle dynamic passing
       • Teams are given data on the course 2 hours before the race, so no global path planning is required
  19. A DARPA GRAND CHALLENGE VEHICLE CRASHING
  20. A DARPA GRAND CHALLENGE VEHICLE THAT DID NOT CRASH
       • … namely Stanley, the winner of the 2005 challenge
  21. TERRAIN MAPPING AND OBSTACLE DETECTION
       • Data from 5 laser scanners mounted on top of the car is used to generate a point cloud of what’s in front of the car
       • Classification problem
           • Drivable
           • Occupied
           • Unknown
       • Area in front of vehicle is represented as a grid
       • Stanley’s system finds the probability that ∆h > δ, where ∆h is the observed height difference of the terrain within a certain cell and δ is a height threshold
       • If this probability is higher than some threshold α, the system defines the cell as occupied
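The per-cell test can be sketched as follows. This is not Stanley's actual implementation: the sketch assumes zero-mean Gaussian measurement noise with standard deviation `sigma`, treats a cell with no laser returns as unknown, and uses placeholder values for δ (`delta`), α (`alpha`), and `sigma`.

```python
import math

def prob_exceeds(delta_h, delta, sigma):
    """P(true height difference > delta) given an observed delta_h
    and assumed Gaussian noise of standard deviation sigma."""
    z = (delta - delta_h) / sigma
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

def classify_cell(delta_h, delta=0.15, alpha=0.95, sigma=0.05):
    """Label one grid cell; delta_h is None when no laser point fell in it."""
    if delta_h is None:
        return "unknown"
    if prob_exceeds(delta_h, delta, sigma) > alpha:
        return "occupied"
    return "drivable"

# Hypothetical observed height differences (m) for three cells.
labels = [classify_cell(h) for h in (0.02, 0.40, None)]
print(labels)
```

Raising α makes the system more reluctant to call a cell occupied, which trades missed obstacles against false alarms caused by sensor noise.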
  22. (CONT.)
       • A discriminative learning algorithm is used to tune the parameters
       • Data is taken as a human driver drives through a mapped terrain avoiding obstacles (supervised learning)
       • Algorithm uses coordinate ascent to determine δ and α
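Coordinate ascent improves one parameter at a time while holding the others fixed, and repeats until no single-parameter change helps. The sketch below searches over discrete candidate values. The `toy_score` function is a hypothetical stand-in: in the setting described above, the score would instead measure agreement between the classifier's labels and the human driver's data.

```python
def coordinate_ascent(score, start, grids, sweeps=5):
    """Maximize score(params) by optimizing one coordinate at a time."""
    params = list(start)
    best = score(params)
    for _ in range(sweeps):
        improved = False
        for i, grid in enumerate(grids):
            for candidate in grid:
                trial = params[:i] + [candidate] + params[i + 1:]
                value = score(trial)
                if value > best:
                    params, best, improved = trial, value, True
        if not improved:
            break  # no coordinate can be improved any further
    return params, best

def toy_score(params):
    """Hypothetical objective, peaked at delta = 0.15 and alpha = 0.90."""
    delta, alpha = params
    return -((delta - 0.15) ** 2 + (alpha - 0.90) ** 2)

delta_grid = [0.05, 0.10, 0.15, 0.20, 0.25]
alpha_grid = [0.80, 0.85, 0.90, 0.95]
tuned, tuned_score = coordinate_ascent(toy_score, [0.05, 0.80], [delta_grid, alpha_grid])
print(tuned)
```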
  23. COMPUTER VISION ASPECT
       • With lasers alone, it is only safe for the car to drive at under 25 mph
       • Needs to go faster to satisfy time constraint
       • Color camera is used for long-range obstacle detection
       • Still the same classification problem
       • Now there are more factors to consider: lighting, material, dust on lens
       • Stanley takes an adaptive approach
  24. VISION ALGORITHM
       • Take out the sky
       • Map a quadrilateral on camera video corresponding with laser sensor boundaries
       • As long as this region is deemed drivable, use the pixels in the quad as a training set for the concept of drivable surface
       • Maintain Gaussians that model the color of drivable terrain
       • Adapt by adjusting previous Gaussians and/or throwing them out and adding new ones
           • Adjustment allows for slow adaptation to lighting conditions
           • Replacement allows for rapid change in color of the road
       • Label regions as drivable if their pixel values are near one or more of the Gaussians and they are connected to the laser quadrilateral
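The adapt-or-replace idea can be sketched with a small set of color Gaussians. This is a simplification under stated assumptions, not Stanley's code: covariances are taken to be diagonal, the match threshold and model size are placeholders, the names `ColorGaussian`, `update`, and `is_drivable_color` are hypothetical, and the connectivity check against the laser quadrilateral is left out.

```python
from dataclasses import dataclass

@dataclass
class ColorGaussian:
    mean: tuple   # (r, g, b)
    var: tuple    # per-channel variance (diagonal covariance assumed)
    count: float  # amount of evidence behind this Gaussian

def distance2(g, pixel):
    """Squared Mahalanobis distance under a diagonal covariance."""
    return sum((p - m) ** 2 / v for p, m, v in zip(pixel, g.mean, g.var))

def update(model, new, max_size=3, match=9.0):
    """Merge `new` into a matching Gaussian (slow adaptation),
    otherwise add it or replace the weakest one (rapid change)."""
    for g in model:
        if distance2(g, new.mean) < match:
            total = g.count + new.count
            w = new.count / total
            g.mean = tuple((1 - w) * a + w * b for a, b in zip(g.mean, new.mean))
            g.var = tuple((1 - w) * a + w * b for a, b in zip(g.var, new.var))
            g.count = total
            return
    if len(model) < max_size:
        model.append(new)
    else:
        weakest = min(range(len(model)), key=lambda i: model[i].count)
        model[weakest] = new

def is_drivable_color(model, pixel, match=9.0):
    """True if the pixel is close to at least one learned Gaussian."""
    return any(distance2(g, pixel) < match for g in model)

# Hypothetical road colors learned from the laser-verified quadrilateral.
model = [ColorGaussian((120.0, 110.0, 100.0), (25.0, 25.0, 25.0), 100.0)]
update(model, ColorGaussian((122.0, 111.0, 101.0), (25.0, 25.0, 25.0), 50.0))  # merged
update(model, ColorGaussian((60.0, 60.0, 60.0), (25.0, 25.0, 25.0), 50.0))     # added
print(len(model), is_drivable_color(model, (61, 59, 60)))
```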
  25. ROAD BOUNDARIES
       • Best way to avoid obstacles on a desert road is to find road boundaries and drive down the middle
       • Uses low-pass one-dimensional Kalman Filters to determine road boundary on both sides of vehicle
       • Small obstacles don’t really affect the boundary found
       • Large obstacles over time have a stronger effect
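A one-dimensional Kalman filter with small process noise behaves like a low-pass filter, which is why a brief outlier barely moves the estimate while a sustained change does. The sketch below tracks a single boundary distance; the noise values `q` and `r` and the measurements are assumptions for illustration.

```python
def kalman_1d(measurements, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Track a slowly varying scalar, such as the lateral distance to one road edge."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q            # predict: state assumed constant, uncertainty grows
        k = p / (p + r)      # Kalman gain
        x = x + k * (z - x)  # correct with the measurement residual
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# A single outlier (small obstacle) only nudges the boundary estimate.
brief = kalman_1d([3.0] * 5 + [1.0] + [3.0] * 5, x0=3.0)
# A sustained change (large obstacle seen over time) pulls the estimate with it.
sustained = kalman_1d([1.0] * 50, x0=3.0)
print(min(brief), sustained[-1])
```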
  26. SLOPE AND RUGGEDNESS
       • If terrain becomes too rugged or steep, vehicle must slow down to maintain control
       • Slope is found from vehicle’s pitch estimate
       • Ruggedness is determined by taking data from vehicle’s z accelerometer with gravity and vehicle vibration filtered out
  27. PATH PLANNING
       • No global planning necessary
       • Coordinate system used is base trajectory + lateral offset
       • Base trajectory is a smoothed version of the driving corridor on the map given to contestants before the race
  28. PATH SMOOTHING
       • Base trajectory computed in 4 steps:
           • Points are added to the map in proportion to local curvature
           • Least-squares optimization is used to adjust trajectories for smoothing
           • Cubic spline interpolation is used to find a path that can be resampled efficiently
           • Calculate the speed limit
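The smoothing step can be approximated by an iterative relaxation that balances two pulls on each interior waypoint: staying close to its original position and moving toward the midpoint of its neighbors. This is a simple stand-in for the least-squares optimization mentioned above, not the method from the Stanley paper; the weights are placeholder values, and the endpoints are held fixed.

```python
def smooth_path(path, weight_data=0.5, weight_smooth=0.1, tolerance=1e-6):
    """Return a smoothed copy of `path` (a list of coordinate tuples)."""
    new = [list(p) for p in path]
    change = tolerance
    while change >= tolerance:
        change = 0.0
        for i in range(1, len(path) - 1):      # endpoints stay fixed
            for d in range(len(path[0])):
                old = new[i][d]
                # Pull toward the original waypoint.
                new[i][d] += weight_data * (path[i][d] - new[i][d])
                # Pull toward the midpoint of the two neighbors.
                new[i][d] += weight_smooth * (
                    new[i - 1][d] + new[i + 1][d] - 2.0 * new[i][d]
                )
                change += abs(new[i][d] - old)
    return new

# Hypothetical corridor with a sharp 90-degree corner at (2, 0).
corner_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
smoothed = smooth_path(corner_path)
print(smoothed[2])
```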
  29. ONLINE PATH PLANNING
       • Determines the actual trajectory of vehicle during race
       • Search algorithm that minimizes a linear combination of continuous cost functions
       • Subject to dynamic and kinematic constraints
           • Max lateral acceleration
           • Max steering angle
           • Max steering rate
           • Max acceleration
       • Penalize hitting obstacles, leaving corridor, leaving center of road
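A heavily simplified sketch of the search: candidate lateral offsets from the base trajectory are scored by a weighted sum of continuous penalties, and candidates that would need too large a lateral jump are discarded. The single `max_step` limit stands in for the dynamic and kinematic constraints listed above, and all weights, names, and values are assumptions.

```python
def plan_offset(current, candidates, obstacles, half_width,
                max_step=1.5, clearance=1.0,
                w_obstacle=100.0, w_corridor=50.0, w_center=1.0):
    """Pick the lateral offset (m) with the lowest weighted cost."""
    best, best_cost = None, float("inf")
    for offset in candidates:
        if abs(offset - current) > max_step:  # stand-in for steering limits
            continue
        # Each penalty grows continuously as the violation gets worse.
        obstacle_cost = sum(max(0.0, clearance - abs(offset - o)) for o in obstacles)
        corridor_cost = max(0.0, abs(offset) - half_width)
        center_cost = abs(offset)
        cost = (w_obstacle * obstacle_cost
                + w_corridor * corridor_cost
                + w_center * center_cost)
        if cost < best_cost:
            best, best_cost = offset, cost
    return best, best_cost

# Hypothetical scene: an obstacle slightly right of the road center.
candidates = [i * 0.5 for i in range(-4, 5)]  # -2.0 m to 2.0 m
best_offset, best_cost = plan_offset(0.0, candidates, obstacles=[0.25], half_width=2.0)
print(best_offset, best_cost)
```

Here the planner swerves one meter to the left: that is the smallest deviation from the center that keeps the full clearance from the obstacle.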
  30. MULTI-AGENT SYSTEMS
  31. RECURSIVE MODELING METHOD (RMM)
       • Agents model the belief states of other agents
       • Bayesian methods implemented
       • Useful in homogeneous non-communicating Multi-Agent Systems (MAS)
       • Has to be cut off at some point (don’t want a situation where agent A thinks that agent B thinks that agent A thinks that…)
       • Agents can affect other agents by affecting the environment to produce a desired reaction
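The need for a cut-off can be illustrated with depth-limited recursive reasoning in a two-agent matrix game. This is only a sketch in the spirit of RMM, not the published method: at depth 0 the other agent is modeled with a uniform (no-information) belief, and at greater depths the agent best-responds to its prediction of the other agent's choice. The payoff tables are invented.

```python
def best_response(payoff, opponent_dist):
    """payoff[a][b]: my payoff for my action a against opponent action b."""
    expected = [sum(p * row[b] for b, p in enumerate(opponent_dist)) for row in payoff]
    return max(range(len(payoff)), key=lambda a: expected[a])

def model_action(my_payoff, other_payoff, depth):
    """Choose an action by modeling the other agent down to `depth` levels."""
    n_other = len(other_payoff)
    if depth == 0:
        # Cut-off: assume nothing about the other agent (uniform belief).
        belief = [1.0 / n_other] * n_other
    else:
        # Model the other agent modeling me, one level shallower.
        predicted = model_action(other_payoff, my_payoff, depth - 1)
        belief = [1.0 if b == predicted else 0.0 for b in range(n_other)]
    return best_response(my_payoff, belief)

# Hypothetical game: two robots each pick corridor 0 or 1 and must not collide.
# Each table is indexed [own action][other agent's action].
payoff_a = [[0, 2], [1, 0]]  # A prefers corridor 0 when B takes corridor 1
payoff_b = [[0, 1], [2, 0]]  # B prefers corridor 1 when A takes corridor 0
action_a = model_action(payoff_a, payoff_b, depth=2)
action_b = model_action(payoff_b, payoff_a, depth=2)
print(action_a, action_b)
```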
  32. HETEROGENEOUS NON-COMMUNICATING MAS
       • Competitive and cooperative learning possible
       • Competitive learning more difficult because agents may end up in an “arms race”
       • Credit-assignment problem
           • Can’t tell if an agent benefitted because its actions were good or because its opponent’s actions were bad
       • Experts and observers have proven useful
       • Different agents may be given different roles to reach the goal
           • Supervised learning to “teach” each agent how to do its part
  33. COMMUNICATION
       • Allowing agents to communicate can lead to deeper levels of planning since agents know (or think they know) the beliefs of others
       • Could allow one agent to “train” another to follow its actions using reinforcement learning
       • Negotiations
       • Commitment
       • Autonomous robots could understand their position in an environment by querying other robots for their believed positions and making a guess based on that (Markov localization, SLAM)
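For readers unfamiliar with Markov localization, here is a minimal single-robot sketch on a circular one-dimensional world: the robot keeps a probability for every cell, sharpens it with each sensor reading, and shifts it with each move. The multi-robot querying described above is not modeled, motion is assumed exact, and the world layout and sensor probabilities are invented.

```python
def sense(belief, world, measurement, p_hit=0.6, p_miss=0.2):
    """Bayes update: reweight each cell by how well it explains the measurement."""
    weighted = [
        b * (p_hit if cell == measurement else p_miss)
        for b, cell in zip(belief, world)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]

def move(belief, step):
    """Shift the belief by `step` cells on a circular world (exact motion assumed)."""
    n = len(belief)
    return [belief[(i - step) % n] for i in range(n)]

# Hypothetical corridor: the robot can tell doors from walls, with some error.
world = ["door", "wall", "wall", "door", "wall"]
belief = [1.0 / len(world)] * len(world)  # start fully uncertain

belief = sense(belief, world, "door")
belief = move(belief, 1)
belief = sense(belief, world, "wall")
print([round(b, 3) for b in belief])
```

After seeing a door, moving one cell, and seeing a wall, the most likely positions are the two cells that sit just past a door.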
  34. NETFLIX CHALLENGE
       • (if time permits)
  35. REFERENCES
       • Alpaydin, E. Introduction to Machine Learning. Cambridge, Mass.: MIT Press, 2004.
       • Kreuziger, J. “Application of Machine Learning to Robotics – An Analysis.” In Proceedings of the Second International Conference on Automation, Robotics, and Computer Vision (ICARCV '92). 1992.
       • Mitchell et al. “Machine Learning.” Annu. Rev. Comput. Sci. 1990. 4:417-33.
       • Stone, P. and Veloso, M. “Multiagent Systems: A Survey from a Machine Learning Perspective.” Autonomous Robots 8, 345-383, 2000.
       • Thrun et al. “Stanley: The Robot that Won the DARPA Grand Challenge.” Journal of Field Robotics 23(9), 661-692, 2006.
