Preparing for the Unknown: Learning a Universal Policy with Online System Identification
Wenhao Yu, Jie Tan, C. Karen Liu, Greg Turk

TL;DR
This paper introduces a universal control policy combined with online system identification, enabling robust performance across diverse and unknown dynamic environments, thereby improving adaptability and narrowing the simulation-to-reality gap.
Contribution
The paper presents UP-OSI, a framework that combines a Universal Policy (UP) with Online System Identification (OSI) so that a controller can adapt to unknown dynamics.
Findings
Effective across tasks such as cart-pole and locomotion
Outperforms the universal policy alone on unseen dynamic models
Reduces the reality gap between simulation and real systems
Abstract
We present a new method of learning control policies that successfully operate under unknown dynamic models. We create such policies by leveraging a large number of training examples that are generated using a physical simulator. Our system is made of two components: a Universal Policy (UP) and a function for Online System Identification (OSI). We describe our control policy as universal because it is trained over a wide array of dynamic models. These variations in the dynamic model may include differences in mass and inertia of the robots' components, variable friction coefficients, or unknown mass of an object to be manipulated. By training the Universal Policy with this variation, the control policy is prepared for a wider array of possible conditions when executed in an unknown environment. The second part of our system uses the recent state and action history of the system to…
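The two-component design described above can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D toy system, the parameter ranges, and the names sample_dynamics, universal_policy, osi, step and run_episode are all hypothetical, and a closed-form estimator stands in for the learned OSI function. The sketch only shows the data flow: a dynamic model is sampled, OSI estimates its parameters from recent state and action history, and the policy takes that estimate as an extra input.

```python
import random
from collections import deque

DT = 0.1  # integration step of the toy system (assumed value)

# Hypothetical parameter ranges, chosen only for illustration.
PARAM_RANGES = {"mass": (0.5, 2.0), "friction": (0.2, 1.0)}


def sample_dynamics(rng):
    """Draw one dynamic model from the assumed training distribution."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}


def step(state, action, true_mass):
    """Toy 1-D dynamics: the state changes by DT * action / mass."""
    return state + DT * action / true_mass


def universal_policy(state, mass_estimate):
    """Stand-in for a trained UP: the action depends on the state AND on
    the estimated model parameter, so one policy covers many models."""
    return -2.0 * mass_estimate * state


def osi(history, fallback_mass):
    """Stand-in for a learned OSI function: estimate the mass from the
    recent (state, action) history. Here it simply inverts the toy dynamics."""
    pairs = list(history)
    estimates = []
    for (s0, a0), (s1, _) in zip(pairs, pairs[1:]):
        ds = s1 - s0
        if abs(ds) > 1e-9 and abs(a0) > 1e-9:
            estimates.append(DT * a0 / ds)
    if not estimates:
        return fallback_mass  # not enough history yet
    return sum(estimates) / len(estimates)


def run_episode(true_mass, steps=50, horizon=5, initial_guess=1.0):
    """Closed loop: identify the model online, then act on the estimate."""
    history = deque(maxlen=horizon)  # sliding window of recent history
    state = 1.0
    mass_estimate = initial_guess
    for _ in range(steps):
        mass_estimate = osi(history, mass_estimate)
        action = universal_policy(state, mass_estimate)
        history.append((state, action))
        state = step(state, action, true_mass)
    return state, mass_estimate


if __name__ == "__main__":
    model = sample_dynamics(random.Random(0))
    final_state, estimate = run_episode(true_mass=model["mass"])
    print("true mass:", model["mass"], "estimated mass:", estimate)
    print("final state:", final_state)
```

In this toy setting the estimate matches the true mass once two transitions have been observed. A learned OSI would presumably need training data generated in simulation and would only approximate the parameters.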
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Hydraulic and Pneumatic Systems
