Uncertainty Aware System Identification with Universal Policies
Buddhika Laknath Semage, Thommen George Karimpanal, Santu Rana, and Svetha Venkatesh

TL;DR
This paper introduces UncAPS, a method combining universal policies and Bayesian optimization to efficiently identify environment parameters and improve sim2real transfer robustness in noisy control tasks.
Contribution
It proposes a novel uncertainty-aware policy search method that leverages universal policies and Bayesian optimization for efficient environment parameter identification.
Findings
UncAPS outperforms baseline methods in noisy control environments.
The approach effectively accounts for aleatoric and epistemic uncertainties.
Empirical results demonstrate improved robustness and transfer performance.
Abstract
Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real-world environments. A common problem associated with sim2real transfer is estimating the real-world environmental parameters in which to ground the simulated environment. Although existing methods such as Domain Randomisation (DR) can produce robust policies by sampling from a distribution of parameters during training, there is no established method for identifying the parameters of the corresponding distribution for a given real-world setting. In this work, we propose Uncertainty-aware policy search (UncAPS), where we use a Universal Policy Network (UPN) to store simulation-trained task-specific policies across the full range of environmental parameters and subsequently employ robust Bayesian optimisation to craft robust policies for the given environment by combining…
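The idea of searching over the parameter input of a universal policy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the one-dimensional linear system, the hand-written `universal_policy` standing in for a trained UPN, and the names `rollout_return` and `identify_theta` are all hypothetical. For simplicity, the sketch replaces robust Bayesian optimisation with a plain grid search that averages repeated noisy rollouts for each candidate parameter.

```python
import random
import statistics


def universal_policy(state, theta):
    # Hypothetical stand-in for a UPN: a single controller conditioned on
    # the assumed environment parameter theta (here, an actuator gain).
    return -state / theta


def rollout_return(theta_assumed, theta_true, noise_std, rng, steps=20):
    # Toy noisy "real" system (an assumption): x' = x + theta_true * a + noise.
    # The return is the negative sum of squared states, so higher is better.
    x, total = 1.0, 0.0
    for _ in range(steps):
        total -= x * x
        action = universal_policy(x, theta_assumed)
        x = x + theta_true * action + rng.gauss(0.0, noise_std)
    return total


def identify_theta(candidates, theta_true, noise_std=0.05, repeats=10, seed=0):
    # Grid search (swapped in for Bayesian optimisation): evaluate each
    # candidate several times to average out rollout noise, and keep the
    # spread as a rough indicator of how noisy each estimate is.
    rng = random.Random(seed)
    scores = {}
    for theta in candidates:
        returns = [
            rollout_return(theta, theta_true, noise_std, rng)
            for _ in range(repeats)
        ]
        scores[theta] = (statistics.mean(returns), statistics.stdev(returns))
    best = max(scores, key=lambda t: scores[t][0])
    return best, scores


best_theta, scores = identify_theta([1.0, 2.0, 3.0, 4.0], theta_true=2.0)
print("identified parameter:", best_theta)
```

In this toy setting the policy conditioned on the true gain drives the state to zero in one step, so its mean return is the highest among the candidates. A Bayesian optimiser would aim for the same outcome with fewer real-world rollouts by modelling the return as a function of the parameter.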
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
