Boosting Action-Information via a Variational Bottleneck on Unlabelled Robot Videos
Haoyu Zhang, Long Cheng

TL;DR
This paper presents a novel variational bottleneck framework that maximizes mutual information between latent and true actions in unlabeled robot videos, leading to improved control performance without requiring action labels.
Contribution
The paper introduces a method that uses the variational information bottleneck to learn action-relevant representations from unlabeled videos, supported by theoretical analysis and extensive experiments.
Findings
Significantly increases mutual information between latent and true actions.
Improves policy performance in simulated and real-world robotic tasks.
Outperforms existing methods in unlabeled video-based learning.
Abstract
Learning from demonstrations (LfD) typically relies on large amounts of action-labeled expert trajectories, which fundamentally constrains the scale of available training data. A promising alternative is to learn directly from unlabeled video demonstrations. However, we find that existing methods tend to encode latent actions that share little mutual information with the true robot actions, leading to suboptimal control performance. To address this limitation, we introduce a novel framework that explicitly maximizes the mutual information between latent actions and true actions, even in the absence of action labels. Our method leverages the variational information bottleneck to extract action-relevant representations while discarding task-irrelevant information. We provide a theoretical analysis showing that our objective indeed maximizes the mutual information between latent and true…
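The information-bottleneck idea mentioned in the abstract can be illustrated with a minimal sketch of a generic variational information bottleneck loss. This is not the paper's objective, which is not given in the text available here. It assumes a diagonal-Gaussian encoder for the latent action, a standard-normal prior, a squared-error prediction term, and an arbitrary trade-off weight `beta`; all function names and the value of `beta` are hypothetical.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), one value per row."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def vib_loss(prediction, target, mu, log_var, beta=1e-3):
    """Generic VIB loss: prediction error plus beta times the KL 'rate'.

    prediction, target: arrays of shape (batch, features), e.g. a predicted
        next-frame embedding and the observed one (an assumption here).
    mu, log_var: encoder outputs of shape (batch, latent_dim).
    beta: assumed trade-off weight; larger values compress the latent more.
    """
    distortion = np.mean(np.sum((prediction - target) ** 2, axis=-1))
    rate = np.mean(gaussian_kl_to_standard_normal(mu, log_var))
    return distortion + beta * rate, distortion, rate
```

The KL term penalizes information the latent carries beyond the prior, while the prediction term rewards keeping whatever is needed to explain the change between frames; `beta` sets the balance between the two.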
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Generative Adversarial Networks and Image Synthesis
