TIMRL: A Novel Meta-Reinforcement Learning Framework for Non-Stationary and Multi-Task Environments

Chenyang Qi; Huiping Li; Panfeng Huang

arXiv:2501.07146·cs.LG·January 14, 2025

TL;DR

This paper introduces TIMRL, a meta-reinforcement learning framework that uses Gaussian mixture models and transformers to better adapt to non-stationary and multi-task environments, improving task inference and sample efficiency.

Contribution

The paper proposes a novel meta-RL method combining Gaussian mixture models and transformers for explicit task encoding in non-stationary environments.

Findings

01. Significantly improves sample efficiency in non-stationary environments

02. Accurately classifies and recognizes multiple tasks

03. Outperforms existing methods on MuJoCo benchmarks

Abstract

In recent years, meta-reinforcement learning (meta-RL) algorithms have been proposed to improve sample efficiency in the field of decision-making and control, enabling agents to learn new knowledge from a small number of samples. However, most research uses a Gaussian distribution to extract the task representation, which adapts poorly to tasks that change in non-stationary environments. To address this problem, we propose a novel meta-reinforcement learning method that leverages a Gaussian mixture model and a transformer network to construct a task inference model. The Gaussian mixture model is utilized to extend the task representation and conduct explicit encoding of tasks. Specifically, the classification of tasks is encoded through the transformer network to determine the Gaussian component corresponding to the task. By leveraging task labels, the transformer network is trained using…
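
The idea of selecting a Gaussian component by task classification can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`softmax`, `sample_task_latent`), the fixed `logits` standing in for the transformer's classification output, the hard argmax selection, and the component means and standard deviations are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def sample_task_latent(logits, means, log_stds, rng):
    """Pick a mixture component from class logits, then sample a latent from it.

    logits   : (K,) scores for K task classes (hypothetical stand-in for the
               transformer's classification output)
    means    : (K, D) mean of each Gaussian component (assumed values)
    log_stds : (K, D) log standard deviation of each component (assumed values)
    """
    probs = softmax(logits)
    k = int(np.argmax(probs))  # hard selection of the most likely component
    eps = rng.standard_normal(means.shape[1])
    z = means[k] + np.exp(log_stds[k]) * eps  # reparameterized Gaussian sample
    return k, probs, z


rng = np.random.default_rng(0)
num_components, latent_dim = 3, 4

# Three well-separated components, one per hypothetical task class.
means = np.stack([
    np.full(latent_dim, -2.0),
    np.zeros(latent_dim),
    np.full(latent_dim, 2.0),
])
log_stds = np.full((num_components, latent_dim), -1.0)

# Logits that favor the third task class.
logits = np.array([0.1, 0.2, 3.0])

k, probs, z = sample_task_latent(logits, means, log_stds, rng)
print("selected component:", k)
print("class probabilities:", probs)
print("task latent z:", z)
```

In this sketch, a single Gaussian would force all tasks to share one mean and variance, whereas the mixture gives each task class its own region of the latent space. A downstream policy would then condition on `z`; that part is omitted here.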
