TransMASK: Masked State Representation through Learned Transformation
Sagar Parekh, Preston Culbertson, and Dylan P. Losey

TL;DR
TransMASK is a self-supervised method that learns to mask irrelevant state components, enhancing robot policy generalization across environments by focusing on task-relevant information without extra labels.
Contribution
The paper introduces a mask-learning technique integrated with imitation learning. The mask is aligned with the Jacobian of the expert policy, so the state features that the expert's actions are sensitive to are identified as relevant.
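To illustrate why the Jacobian of a policy carries relevance information, here is a minimal sketch. It is not the authors' implementation: the toy policy, the finite-difference estimator, and every name below (toy_expert_policy, finite_difference_jacobian, relevance_scores) are hypothetical and chosen only for illustration. The idea shown is that a state component the policy ignores produces an all-zero column in the Jacobian of actions with respect to state.

```python
import numpy as np


def toy_expert_policy(state):
    # Hypothetical expert (an assumption, not from the paper): the action
    # depends only on state[0] and state[1]. state[2] and state[3] stand in
    # for distractors such as table color or background clutter.
    return np.array([2.0 * state[0] - state[1], np.tanh(state[1])])


def finite_difference_jacobian(policy, state, eps=1e-5):
    # Central-difference estimate of d(action)/d(state) at one state.
    state = np.asarray(state, dtype=float)
    base = policy(state)
    jac = np.zeros((base.size, state.size))
    for j in range(state.size):
        step = np.zeros_like(state)
        step[j] = eps
        jac[:, j] = (policy(state + step) - policy(state - step)) / (2 * eps)
    return jac


def relevance_scores(policy, states):
    # Average absolute Jacobian column sum over a batch of states.
    # A score near zero means the policy is insensitive to that component.
    total = np.zeros(states.shape[1])
    for s in states:
        total += np.abs(finite_difference_jacobian(policy, s)).sum(axis=0)
    return total / len(states)


rng = np.random.default_rng(0)
states = rng.normal(size=(50, 4))
scores = relevance_scores(toy_expert_policy, states)
print(scores)  # the last two entries are zero for this toy policy
```

In this toy setting the expert policy is available in closed form. In imitation learning only demonstrations are available, so how the mask is actually aligned with the expert's Jacobian is a detail of the paper that this sketch does not attempt to reproduce.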
Findings
Improves policy robustness to environmental changes.
Can be combined with various imitation learning frameworks.
Outperforms existing methods in relevant state extraction.
Abstract
Humans train robots to complete tasks in one environment, and expect robots to perform those same tasks in new environments. As humans, we know which aspects of the environment (i.e., the state) are relevant to the task. But there are also things that do not matter; e.g., the color of the table or the presence of clutter in the background. Ideally, the robot's policy learns to ignore these irrelevant state components. Achieving this invariance improves generalization: the robot knows not to factor irrelevant variables into its control decisions, making the policy more robust to environment changes. In this paper we therefore propose a self-supervised method to learn a mask which, when multiplied by the observed state, transforms that state into a latent representation that is biased towards relevant elements. Our method -- which we call TransMASK -- can be combined with a variety of…
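The masking step described in the abstract can be sketched as follows. This is an assumption-laden illustration rather than the paper's method: it assumes an elementwise mask with a hand-set value (the abstract says only that the mask is "multiplied by the observed state", and in TransMASK the mask is learned), and the linear policy, weights, and states are made-up numbers.

```python
import numpy as np


def apply_mask(mask, state):
    # Elementwise product (an assumed form of the multiplication):
    # components with mask 0 are removed from the latent representation.
    return mask * state


def linear_policy(weights, latent):
    # Hypothetical downstream policy acting on the latent state.
    return weights @ latent


# Hand-set mask for illustration only; the method described above learns it.
mask = np.array([1.0, 1.0, 0.0, 0.0])
weights = np.array([[0.5, -1.0, 0.3, 0.7],
                    [1.0, 0.2, -0.4, 0.9]])

# Same task-relevant components, different distractor components.
train_state = np.array([0.2, -0.1, 0.5, 0.5])
test_state = np.array([0.2, -0.1, -3.0, 8.0])

masked_train = linear_policy(weights, apply_mask(mask, train_state))
masked_test = linear_policy(weights, apply_mask(mask, test_state))
unmasked_train = linear_policy(weights, train_state)
unmasked_test = linear_policy(weights, test_state)

print(np.allclose(masked_train, masked_test))      # actions agree
print(np.allclose(unmasked_train, unmasked_test))  # actions differ
```

With the mask applied, the action is unchanged when only the distractor components change, which is the invariance the abstract ties to better generalization.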
Taxonomy
Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Robot Manipulation and Learning
