Autonomous exploration for navigating in non-stationary CMPs

Pratik Gajane; Ronald Ortner; Peter Auer; Csaba Szepesvari

arXiv:1910.08446 · cs.LG · October 21, 2019 · 6 cites

Open Access

TL;DR

This paper studies learning to navigate in a controlled Markov process (CMP) whose transition probabilities may change abruptly. It proposes a performance measure, exploration steps, and a meta-algorithm, MNM, with a proven upper bound on the exploration steps in terms of the number of changes.

Contribution

The paper presents a performance measure, exploration steps, and a meta-algorithm, MNM, for navigation in non-stationary CMPs, and proves an upper bound on the exploration steps in terms of the number of changes in the environment.

Findings

01 Proposed exploration steps as a new performance measure.

02 Developed MNM, a learning meta-algorithm for CMPs whose transition probabilities may change abruptly.

03 Proved an upper bound on the exploration steps in terms of the number of changes.

Abstract

We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) where transition probabilities may abruptly change. For this setting, we propose a performance measure called exploration steps, which counts the time steps at which the learner lacks sufficient knowledge to navigate its environment efficiently. We devise a learning meta-algorithm, MNM, and prove an upper bound on the exploration steps in terms of the number of changes.
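
As an illustration only, the measure described above can be read as a counter over time steps. The sketch below is a toy in Python, not the paper's formal definition: the per-step flag `knows_enough` and the function name `count_exploration_steps` are assumptions introduced here, and the abstract does not say how "sufficient knowledge" is decided.

```python
from typing import Sequence


def count_exploration_steps(knows_enough: Sequence[bool]) -> int:
    """Count the time steps at which the learner lacks sufficient knowledge.

    knows_enough[t] is a hypothetical flag: True if, at step t, the learner
    could navigate its environment efficiently. How that flag would be
    computed is an assumption left open in this sketch.
    """
    return sum(1 for ok in knows_enough if not ok)


# Toy trace (made up for illustration): the learner explores for two steps,
# navigates well for three, then an abrupt change at step 5 forces three
# more steps of exploration before it recovers.
trace = [False, False, True, True, True, False, False, False, True, True]
exploration_steps = count_exploration_steps(trace)
print(exploration_steps)
```

In this toy reading, each abrupt change can add a fresh run of exploration steps, which is why a bound stated in terms of the number of changes is a natural form for the guarantee.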

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Data Stream Mining Techniques