Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning

Yao Mu; Yuzheng Zhuang; Fei Ni; Bin Wang; Jianyu Chen; Jianye Hao; Ping Luo

arXiv:2210.04209·cs.LG·October 11, 2022

TL;DR

This paper introduces DOMINO, a method that learns disentangled contexts to improve adaptation in meta-reinforcement learning by maximizing mutual information and reducing sample complexity in complex, confounded environments.

Contribution

The paper proposes DOMINO, a novel approach for disentangled context learning that enhances dynamics generalization in meta-reinforcement learning under multi-confounded challenges.

Findings

01 DOMINO improves sample efficiency in unseen environments.

02 Disentangled contexts lead to better dynamics adaptation.

03 DOMINO outperforms existing methods in complex scenarios.

Abstract

Adapting to the changes in transition dynamics is essential in robotic applications. By learning a conditional policy with a compact context, context-aware meta-reinforcement learning provides a flexible way to adjust behavior according to dynamics changes. However, in real-world applications, the agent may encounter complex dynamics changes. Multiple confounders can influence the transition dynamics, making it challenging to infer accurate context for decision-making. This paper addresses such a challenge by Decomposed Mutual INformation Optimization (DOMINO) for context learning, which explicitly learns a disentangled context to maximize the mutual information between the context and historical trajectories, while minimizing the state transition prediction error. Our theoretical analysis shows that DOMINO can overcome the underestimation of the mutual information caused by…
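
To make the mutual-information objective more concrete, here is a minimal sketch of a contrastive (InfoNCE-style) lower bound and a decomposed version that sums one bound per context part. The abstract as shown does not name the estimator, so InfoNCE is an assumption here, and `info_nce`, `decomposed_bound`, and the dot-product critic are hypothetical names chosen for illustration rather than the authors' implementation. The sketch relies on one known property: with K samples per batch, a single InfoNCE estimate cannot exceed log K. Summing separate bounds is a valid lower bound on the joint mutual information only if the context parts are independent, which is also an assumption.

```python
import math


def info_nce(scores):
    """InfoNCE lower bound computed from a K x K critic score matrix.

    scores[i][j] scores context i against trajectory j; the diagonal
    holds the matched (positive) pairs. The result never exceeds log K.
    """
    k = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(s - m) for s in row))
        total += row[i] - log_sum
    return math.log(k) + total / k


def dot(u, v):
    """Dot-product critic (an illustrative choice, not from the paper)."""
    return sum(a * b for a, b in zip(u, v))


def decomposed_bound(context_parts, traj_embeddings):
    """Sum of per-part InfoNCE bounds.

    context_parts: list of N parts, each a list of K context vectors.
    traj_embeddings: list of K trajectory embeddings.
    Assumes the parts are independent, so the sum lower-bounds the
    mutual information between the full context and the trajectory.
    """
    total = 0.0
    for part in context_parts:
        scores = [[dot(c, t) for t in traj_embeddings] for c in part]
        total += info_nce(scores)
    return total


# Toy batch: K = 4 perfectly separable embeddings, N = 2 context parts.
K = 4
embeddings = [[10.0 if i == j else 0.0 for j in range(K)] for i in range(K)]
single = info_nce([[dot(c, t) for t in embeddings] for c in embeddings])
summed = decomposed_bound([embeddings, embeddings], embeddings)
```

In this toy batch `single` saturates at log K while `summed` reaches 2 log K, which illustrates why splitting the context into several parts can loosen the per-batch ceiling on the estimate.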

Code & Models

Repositories

YaoMarkMu/DOMINO_MB-MetaRL (tf, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Advanced Bandit Algorithms Research