MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces

Tianyu Zheng; Ge Zhang; Xingwei Qu; Ming Kuang; Stephen W. Huang; Zhaofeng He

arXiv:2402.12845 · cs.AI · February 21, 2024 · 1 cites

TL;DR

This paper introduces MORE-3S, a multimodal offline reinforcement learning method that aligns visual and textual data into a shared semantic space, improving decision-making and strategic planning in RL tasks.

Contribution

It presents a novel approach that transforms offline RL into a supervised learning problem using multimodal and pre-trained language models for better state and action understanding.

Findings

01 Outperforms existing baselines on Atari and OpenAI Gym environments.

02 Enhances RL training performance through multimodal semantic alignment.

03 Promotes long-term strategic thinking in RL agents.

Abstract

Drawing upon the intuition that aligning different modalities to the same semantic embedding space would allow models to understand states and actions more easily, we propose a new perspective to the offline reinforcement learning (RL) challenge. More concretely, we transform it into a supervised learning task by integrating multimodal and pre-trained language models. Our approach incorporates state information derived from images and action-related data obtained from text, thereby bolstering RL training performance and promoting long-term strategic thinking. We emphasize the contextual understanding of language and demonstrate how decision-making in RL can benefit from aligning states' and actions' representation with languages' representation. Our method significantly outperforms current baselines as evidenced by evaluations conducted on Atari and OpenAI Gym environments. This…
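The core idea in the abstract, scoring actions by how well their text representation matches an image-derived state representation in one shared space and then training with an ordinary supervised loss, can be illustrated with a toy sketch. This is not the authors' architecture: the hand-written vectors stand in for the outputs of image and text encoders, and the function names, the temperature value, and the action names are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def action_probs(state_emb, action_embs, temperature=0.1):
    """Softmax over similarities between a state and each action-text embedding."""
    logits = [cosine(state_emb, a) / temperature for a in action_embs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_loss(state_emb, action_embs, logged_action, temperature=0.1):
    """Cross-entropy against the action recorded in the offline dataset."""
    probs = action_probs(state_emb, action_embs, temperature)
    return -math.log(probs[logged_action])


# Hypothetical stand-ins for encoder outputs in a shared 3-d space.
ACTION_NAMES = ["move left", "move right", "fire"]
ACTION_EMBS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
state_emb = [0.1, 0.9, 0.2]  # pretend image-encoder output for one frame

probs = action_probs(state_emb, ACTION_EMBS)
predicted = max(range(len(probs)), key=probs.__getitem__)
loss = supervised_loss(state_emb, ACTION_EMBS, logged_action=1)
print(ACTION_NAMES[predicted], round(loss, 4))
```

In a real system the embeddings would come from trained encoders and the loss would be minimized over logged trajectories; the sketch only shows why a shared space turns action selection into a classification problem.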

Code & Models

Repositories

zheng0428/more_
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics