Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

This document proposes a method called Trajectory-wise Multiple Choice Learning (MCL) to improve generalization in model-based reinforcement learning. The method uses a multi-headed dynamics model to approximate the multi-modal distribution of transition dynamics. Trajectory-wise MCL updates the prediction head that is most accurate over an entire trajectory segment, allowing each head to specialize. An adaptive planning method then uses the most accurate head based on recent experience. Evaluation shows the approach achieves superior generalization to new environments compared to baseline methods.

Younggyo Seo¹*, Kimin Lee²*, Ignasi Clavera², Thanard Kurutach², Jinwoo Shin¹ and Pieter Abbeel²
KAIST¹, UC Berkeley²
*Equal Contribution
https://sites.google.com/view/trajectory-mcl

Problem: Dynamics Generalization
● Model-based RL suffers from a dynamics generalization problem
[Figure labels: Evaluation, Training, Deployment]
● Multi-modal distribution of transition dynamics

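Main Components
● Main idea: explicitly approximate the multi-modal distribution
● Multi-headed dynamics model
Approximates the multi-modal distribution by learning specialized prediction heads
● Multiple choice learning (MCL)
Updates the most accurate prediction head for specialization
● Adaptive planning
Uses the prediction head that is most accurate over recent experience for planning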
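
As a rough illustration of the multi-headed dynamics model, the sketch below feeds a shared backbone into several prediction heads, each of which produces its own next-state prediction. The class name `MultiHeadedDynamics`, the layer sizes, the ReLU backbone, the random initialization, and the choice to predict a change in state are all assumptions made for this example; the slides do not specify the architecture.

```python
import random


class MultiHeadedDynamics:
    """Toy multi-headed dynamics model (hypothetical, for illustration only).

    A shared linear backbone with ReLU produces features; each of the
    `num_heads` linear heads maps those features to a predicted state change.
    """

    def __init__(self, state_dim, action_dim, hidden_dim=8, num_heads=3, seed=0):
        rng = random.Random(seed)
        in_dim = state_dim + action_dim
        # Shared backbone weights: hidden_dim x (state_dim + action_dim).
        self.backbone = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(hidden_dim)
        ]
        # One weight matrix per head: state_dim x hidden_dim.
        self.heads = [
            [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
             for _ in range(state_dim)]
            for _ in range(num_heads)
        ]

    @staticmethod
    def _matvec(matrix, vector):
        return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

    def predict_all(self, state, action):
        """Return a list with one next-state prediction per head."""
        features = [max(0.0, z) for z in self._matvec(self.backbone, state + action)]
        return [
            # Assumption: each head predicts the state change, added to the state.
            [s + d for s, d in zip(state, self._matvec(head, features))]
            for head in self.heads
        ]


model = MultiHeadedDynamics(state_dim=2, action_dim=1, num_heads=3)
predictions = model.predict_all([0.0, 1.0], [0.5])
print(len(predictions), "heads, each predicting a state of size", len(predictions[0]))
```

Keeping the heads as separate output layers on top of shared features is one common way to let them specialize while still sharing most parameters; other designs are possible.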

Trajectory-wise Multiple Choice Learning
● For MCL, each prediction head should receive distinct training samples
[Figure: transitions] Which prediction head is most accurate over these transitions?
[Figure: trajectory segment]
● Trajectory-wise multiple choice learning
The difference in dynamics is captured more distinctly by considering the prediction error over a trajectory segment
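
The contrast between choosing a head per transition and choosing one head per trajectory segment can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the squared-error measure, and the error values are made up for the example. In MCL only the chosen head would be updated, so the training loss for the segment would be the chosen head's summed error.

```python
def squared_error(prediction, target):
    """Squared error between a predicted and an observed next state."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target))


def transition_wise_choice(errors):
    """errors[t][h] is the error of head h at step t.

    Returns the best head index for each step independently.
    """
    return [min(range(len(step)), key=step.__getitem__) for step in errors]


def trajectory_wise_choice(errors):
    """Return (best_head, summed_error) for the whole segment.

    The head with the lowest error summed over all steps is selected,
    so the whole segment is assigned to a single head.
    """
    num_heads = len(errors[0])
    totals = [sum(step[h] for step in errors) for h in range(num_heads)]
    best = min(range(num_heads), key=totals.__getitem__)
    return best, totals[best]


# Hypothetical per-step errors for two heads over a 4-step segment.
errors = [[0.10, 0.12], [0.30, 0.05], [0.11, 0.12], [0.25, 0.02]]
print(transition_wise_choice(errors))     # [0, 1, 0, 1]
print(trajectory_wise_choice(errors)[0])  # 1
```

In this toy case the per-transition choice flips between heads on near-ties, whereas summing over the segment assigns all four steps to head 1, which is the behavior the slide argues helps heads specialize.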

Context-conditional Multi-headed Dynamics Model
● We also introduce a context encoder for online adaptation to unseen environments
● The context encoder g captures contextual information from past experience
● See [Lee’20] for more information
[Lee’20] Lee, Kimin, Younggyo Seo, Seunghyun Lee, Honglak Lee, and Jinwoo Shin. "Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning." In ICML, 2020.

Analysis on Trajectory-wise MCL
[Figure panels: Transitions; Trajectory segment (environment: Hopper)]
● Specialization leads to superior generalization performance

Analysis on Adaptive Planning
● Qualitative analysis
○ Manually assign the prediction heads specialized for [Mass: 2.5] to the [Mass: 1.0] environment
[Figure panels: [Mass: 1.0] with prediction heads specialized for [Mass: 2.5]; [Mass: 2.5] with prediction heads specialized for [Mass: 2.5]]
The agent acts as if it has a heavyweight body!
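
The head selection behind adaptive planning, which uses the head that has been most accurate over recent experience, might look like the sketch below. Everything here is a hypothetical toy: `select_head`, `toy_predict_all`, the one-dimensional velocity state, and the update `v + a / mass` are assumptions chosen to echo the mass example, not the method's actual planner or environment.

```python
def select_head(predict_all, recent_transitions):
    """Pick the head with the lowest summed squared error over recent
    (state, action, next_state) tuples."""
    totals = None
    for state, action, next_state in recent_transitions:
        predictions = predict_all(state, action)
        step_errors = [
            sum((p - t) ** 2 for p, t in zip(prediction, next_state))
            for prediction in predictions
        ]
        if totals is None:
            totals = step_errors
        else:
            totals = [total + err for total, err in zip(totals, step_errors)]
    if totals is None:
        raise ValueError("need at least one recent transition")
    return min(range(len(totals)), key=totals.__getitem__)


def toy_predict_all(state, action):
    """Two hypothetical heads: head 0 assumes mass 1.0, head 1 assumes mass 2.5."""
    velocity, force = state[0], action[0]
    return [[velocity + force / 1.0], [velocity + force / 2.5]]


# Recent experience generated by a body of mass 2.5 under the toy update rule.
heavy_experience = [([0.0], [1.0], [0.4]), ([0.4], [1.0], [0.8])]
# Recent experience generated by a body of mass 1.0.
light_experience = [([0.0], [1.0], [1.0]), ([1.0], [1.0], [2.0])]

print(select_head(toy_predict_all, heavy_experience))  # 1
print(select_head(toy_predict_all, light_experience))  # 0
```

Forcing the index to 1 while acting in the mass 1.0 setting would correspond to the manual assignment in the qualitative analysis above, where the planner reasons with the wrong specialized head.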

Comparative Evaluation
● Superior generalization performance on 6 unseen environments

Conclusion
● For dynamics generalization
○ Context-conditional multi-headed dynamics model
○ Trajectory-wise multiple choice learning
○ Adaptive planning
Thank you!

