Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments

Yafei Hu; Junyi Geng; Chen Wang; John Keller; and Sebastian Scherer

arXiv:2204.03140·cs.RO·May 26, 2023

TL;DR

This paper introduces an off-policy evaluation method with online adaptation that enables robots to predict future state values, improving exploration efficiency in complex real-world environments like subterranean and urban areas.

Contribution

It presents a novel approach combining offline Monte-Carlo training and online TD adaptation for robot exploration, with a new intrinsic reward based on sensor coverage.

Findings

01. Enhanced prediction accuracy of future states.

02. Improved exploration performance over existing methods.

03. First demonstration of value function prediction in real-world challenging environments.

Abstract

Autonomous exploration has many important applications. However, classic information gain-based or frontier-based exploration relies only on the robot's current state to determine the immediate exploration goal; it cannot predict the value of future states and thus makes inefficient exploration decisions. This paper presents a method to learn how "good" states are, measured by the state value function, to provide guidance for robot exploration in challenging real-world environments. We formulate our work as an off-policy evaluation (OPE) problem for robot exploration (OPERE). The method consists of offline Monte-Carlo training on real-world data followed by online Temporal Difference (TD) adaptation to optimize the trained value estimator. We also design an intrinsic reward function based on sensor information coverage to enable the robot to gain more information with…
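The two-stage idea in the abstract (fit a value estimator offline to Monte-Carlo returns, then adjust it online with TD updates) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear value estimator, the feature vectors, the scalar reward inputs, and the function names `discounted_returns`, `fit_offline_mc`, and `td0_update` are all assumptions made for illustration, and the sketch omits any off-policy correction and any model of the sensor-coverage reward.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Monte-Carlo targets: G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def fit_offline_mc(features, rewards, gamma=0.99):
    """Offline stage (assumed linear model): least-squares fit of
    V(s) = w . phi(s) to the Monte-Carlo returns of one logged episode."""
    targets = discounted_returns(rewards, gamma)
    w, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return w


def td0_update(w, phi, reward, phi_next, gamma=0.99, lr=0.01):
    """Online stage: one semi-gradient TD(0) step on the linear estimator."""
    td_error = reward + gamma * (w @ phi_next) - (w @ phi)
    return w + lr * td_error * phi


# Hypothetical data: 3 logged states with one-hot features and unit rewards.
features = np.eye(3)
w = fit_offline_mc(features, [1.0, 1.0, 1.0], gamma=0.5)
# Adapt online from a single newly observed transition (state 0 -> state 1).
w = td0_update(w, features[0], 1.0, features[1], gamma=0.5, lr=0.1)
```

In this sketch the offline fit supplies an initial estimate from complete logged episodes, while the TD step needs only a single transition, which is what makes it usable during deployment.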

Taxonomy

Topics: Reinforcement Learning in Robotics · Context-Aware Activity Recognition Systems · Multimodal Machine Learning Applications