Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective

Haoran He; Peilin Wu; Chenjia Bai; Hang Lai; Lingxiao Wang; Ling Pan; Xiaolin Hu; Weinan Zhang

arXiv:2305.18464·cs.LG·October 15, 2024·1 cites

TL;DR

This paper introduces a novel approach called Historical Information Bottleneck (HIB) that leverages privileged knowledge from historical trajectories to improve the transfer of reinforcement learning policies from simulation to real-world robotic control.

Contribution

The paper formulates the sim-to-real gap as an information bottleneck problem and proposes HIB to better utilize privileged knowledge for improved generalization.
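
For orientation, the generic information bottleneck objective can be written as below. This is the standard textbook form, not necessarily the exact objective used in the paper. The mapping of symbols is an assumption made here for illustration: $X$ stands for the history of local states available on the real robot, $Y$ for the privileged knowledge available only in simulation, $Z$ for the learned representation, and $\beta$ for a trade-off weight.

```latex
% Generic information bottleneck objective (illustrative; symbol roles are assumptions):
%   X = history of local states, Y = privileged knowledge, Z = learned representation
\max_{p(z \mid x)} \; I(Z; Y) \;-\; \beta \, I(Z; X), \qquad \beta > 0
```

Under this reading, the first term rewards representations that are predictive of privileged knowledge, while the second penalizes retaining more of the raw history than is needed.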

Findings

01. HIB reduces the value discrepancy between oracle and learned policies.

02. Empirical results show HIB outperforms previous methods in simulated and real-world tasks.

03. Theoretical analysis supports the effectiveness of privileged knowledge representation.

Abstract

Reinforcement Learning (RL) has recently achieved remarkable success in robotic control. However, most works in RL operate in simulated environments where privileged knowledge (e.g., dynamics, surroundings, terrains) is readily available. Conversely, in real-world scenarios, robot agents usually rely solely on local states (e.g., proprioceptive feedback of robot joints) to select actions, leading to a significant sim-to-real gap. Existing methods address this gap by either gradually reducing the reliance on privileged knowledge or performing a two-stage policy imitation. However, we argue that these methods are limited in their ability to fully leverage the available privileged knowledge, resulting in suboptimal performance. In this paper, we formulate the sim-to-real gap as an information bottleneck problem and therefore propose a novel privileged knowledge distillation method called…

Peer Reviews

Decision·CoRL 2024

Reviewer 01·Rating 4·Confidence 3

Reviewer 02·Rating 3·Confidence 3

Reviewer 03·Rating 3·Confidence 3

Code & Models

Repositories

tinnerhrhe/HIB_Policy·Official

Taxonomy

Topics·Reinforcement Learning in Robotics

Methods·Knowledge Distillation