Error-Aware Policy Learning: Zero-Shot Generalization in Partially Observable Dynamic Environments

Visak Kumar; Sehoon Ha; C. Karen Liu

arXiv:2103.07732·cs.RO·March 16, 2021

TL;DR

This paper presents an error-aware policy learning method that enables zero-shot transfer of policies to new environments with unobservable factors, demonstrated on an assistive walking device and standard RL tasks.

Contribution

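The paper introduces an error-aware policy (EAP) that takes as input the predicted future state error in the target environment. This makes the effect of unobservable factors explicit during training and enables zero-shot generalization in partially observable dynamic environments.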

Findings

01 EAP successfully transfers to different human agents with unseen biomechanics.

02 The method generalizes to standard RL control tasks.

03 EAP improves robustness in sim-to-real transfer scenarios.

Abstract

Simulation provides a safe and efficient way to generate useful data for learning complex robotic tasks. However, matching simulation and real-world dynamics can be quite challenging, especially for systems that have a large number of unobserved or unmeasurable parameters, which may lie in the robot dynamics itself or in the environment with which the robot interacts. We introduce a novel approach to tackle such a sim-to-real problem by developing policies capable of adapting to new environments, in a zero-shot manner. Key to our approach is an error-aware policy (EAP) that is explicitly made aware of the effect of unobservable factors during training. An EAP takes as input the predicted future state error in the target environment, which is provided by an error-prediction function, simultaneously trained with the EAP. We validate our approach on an assistive walking device trained to…
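
The data flow described in the abstract (an error-prediction function feeding its output into the policy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear-plus-tanh models, the dimensions, the choice of inputs to the error predictor, and the names `predict_error` and `error_aware_policy` are all hypothetical. Training, which the abstract says happens simultaneously for both components, is omitted.

```python
import numpy as np

# Hypothetical dimensions, chosen only for this illustration.
STATE_DIM, ACTION_DIM = 4, 2

rng = np.random.default_rng(0)

# Randomly initialised weights stand in for trained parameters.
W_err = rng.normal(scale=0.1, size=(STATE_DIM, STATE_DIM + ACTION_DIM))
W_pi = rng.normal(scale=0.1, size=(ACTION_DIM, 2 * STATE_DIM))


def predict_error(state, prev_action, weights):
    """Hypothetical error-prediction function.

    Returns an estimate of the future state error in the target
    environment. Using (state, previous action) as inputs is an
    assumption of this sketch.
    """
    features = np.concatenate([state, prev_action])
    return np.tanh(weights @ features)


def error_aware_policy(state, predicted_error, weights):
    """Hypothetical error-aware policy.

    The policy is conditioned on the predicted error in addition
    to the state, so it can react to unobserved dynamics.
    """
    features = np.concatenate([state, predicted_error])
    return np.tanh(weights @ features)


state = rng.normal(size=STATE_DIM)
prev_action = np.zeros(ACTION_DIM)

predicted_error = predict_error(state, prev_action, W_err)
action = error_aware_policy(state, predicted_error, W_pi)
```

The point of the sketch is only the wiring: the policy never sees the unobservable parameters directly, just the error estimate that summarises their effect.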

Taxonomy

Methods: Attentive Walk-Aggregating Graph Neural Network