Provable Sim-to-real Transfer in Continuous Domain with Partial Observations
Jiachen Hu, Han Zhong, Chi Jin, Liwei Wang

TL;DR
This paper provides a theoretical analysis of sim-to-real transfer for RL in continuous domains with partial observations, demonstrating that robust adversarial training can produce policies competitive with optimal real-world policies.
Contribution
The paper introduces a new algorithm for infinite-horizon average-cost LQGs, together with a regret bound, and a novel history clipping scheme for sim-to-real transfer in partial observation settings.
Findings
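Robust adversarial training in simulation yields a policy that is competitive with the optimal policy in the real-world environment, when both environments are modeled as LQG systems.
The proposed algorithm for infinite-horizon average-cost LQGs has a regret bound that depends on model complexity.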
A novel history clipping scheme enhances the transfer process.
Abstract
Sim-to-real transfer trains RL agents in simulated environments and then deploys them in the real world. It has been widely used in practice because collecting samples in simulation is often cheaper, safer, and much faster than in the real world. Despite the empirical success of sim-to-real transfer, its theoretical foundation is much less understood. In this paper, we study sim-to-real transfer in continuous domains with partial observations, where the simulated and real-world environments are modeled by linear quadratic Gaussian (LQG) systems. We show that a popular robust adversarial training algorithm is capable of learning a policy from the simulated environment that is competitive with the optimal policy in the real-world environment. To achieve our results, we design a new algorithm for infinite-horizon average-cost LQGs and establish a…
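The robust adversarial training idea in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a scalar LQG system, restricts policies to memoryless output feedback u_t = -k * y_t (the optimal LQG controller would use a Kalman filter), and picks the gain that minimizes the worst-case suboptimality gap over a small, hypothetical set of simulators. All names and parameter values (`average_cost`, `robust_gain`, `SIMULATORS`, `GAINS`, `real_a`) are assumptions made for illustration.

```python
import math

def average_cost(a, k, b=1.0, c=1.0, q=1.0, r=0.1, var_w=1.0, var_z=0.5):
    """Stationary average cost of u_t = -k * y_t on the scalar LQG
    x_{t+1} = a*x_t + b*u_t + w_t,  y_t = c*x_t + z_t,
    with per-step cost q*y_t**2 + r*u_t**2 (illustrative parameters)."""
    closed_loop = a - b * k * c
    if abs(closed_loop) >= 1.0:
        return math.inf  # unstable closed loop: the average cost diverges
    # Stationary state variance of x_{t+1} = closed_loop*x_t - b*k*z_t + w_t.
    var_x = (b * b * k * k * var_z + var_w) / (1.0 - closed_loop ** 2)
    var_y = c * c * var_x + var_z
    return (q + r * k * k) * var_y  # u_t**2 = k**2 * y_t**2

def robust_gain(sim_params, gains):
    """Return the gain minimizing the worst-case gap to the best gain
    of each simulator (a min-max over the simulator set)."""
    best_per_sim = {a: min(average_cost(a, k) for k in gains) for a in sim_params}

    def worst_gap(k):
        return max(average_cost(a, k) - best_per_sim[a] for a in sim_params)

    return min(gains, key=worst_gap)

# Hypothetical simulator class: the unknown dynamics parameter a lies in [0.8, 1.2].
SIMULATORS = [0.8, 0.9, 1.0, 1.1, 1.2]
GAINS = [i / 100 for i in range(0, 201)]  # candidate gains 0.00 .. 2.00

k_robust = robust_gain(SIMULATORS, GAINS)

# "Real" system: inside the simulator range but not one of the simulators.
real_a = 1.05
gap = average_cost(real_a, k_robust) - min(average_cost(real_a, k) for k in GAINS)
print(f"robust gain = {k_robust:.2f}, sim-to-real gap = {gap:.4f}")
```

Here the gap is measured against the best gain within this restricted policy class, not against the true LQG optimum, so the printed value only indicates how much is lost by training against the whole simulator set rather than the real system.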
Taxonomy
Topics: Advanced Bandit Algorithms Research · Influenza Virus Research Studies · Data Stream Mining Techniques
