Combating the Compounding-Error Problem with a Multi-step Model

Kavosh Asadi; Dipendra Misra; Seungchan Kim; Michael L. Littman

arXiv:1905.13320 · cs.LG · June 3, 2019 · 27 cites

TL;DR

This paper introduces a multi-step model for model-based reinforcement learning that directly predicts the outcome of an action sequence, reducing error accumulation and improving planning accuracy.

Contribution

The paper proposes a multi-step model that directly predicts the outcome of a sequence of actions, addressing the compounding-error problem that arises when a one-step model is composed with itself.

Findings

01. Multi-step models are more conducive to efficient value-function estimation.

02. Multi-step models lead to better action selection.

03. Both theoretical and empirical results support the multi-step approach.

Abstract

Model-based reinforcement learning is an appealing framework for creating agents that learn, plan, and act in sequential environments. Model-based algorithms typically involve learning a transition model that takes a state and an action and outputs the next state, that is, a one-step model. This model can be composed with itself to enable predicting multiple steps into the future, but one-step prediction errors can get magnified, leading to unacceptable inaccuracy. This compounding-error problem plagues planning and undermines model-based reinforcement learning. In this paper, we address the compounding-error problem by introducing a multi-step model that directly outputs the outcome of executing a sequence of actions. Novel theoretical and empirical results indicate that the multi-step model is more conducive to efficient value-function estimation, and it yields better action selection…
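The difference between composing a one-step model and querying a multi-step model can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 1-D dynamics, the function names, and in particular the premise that both models are off by the same relative error on each call. Whether a learned multi-step model actually achieves a comparable per-call error is the empirical question the paper studies.

```python
# Toy 1-D illustration of compounding error. The dynamics, names, and error
# size below are hypothetical and are not the paper's experimental setup.

MODEL_BIAS = 0.1  # assumed relative error a model makes on every call


def true_step(state, action):
    """Hypothetical true dynamics: the action is added to the state."""
    return state + action


def true_rollout(state, actions):
    """Ground-truth state after executing the whole action sequence."""
    for action in actions:
        state = true_step(state, action)
    return state


def one_step_model(state, action):
    """Imperfect one-step model: overshoots the true next state by MODEL_BIAS."""
    return (1.0 + MODEL_BIAS) * true_step(state, action)


def composed_rollout(state, actions):
    """Multi-step prediction by feeding the one-step model its own output."""
    for action in actions:
        state = one_step_model(state, action)
    return state


def multi_step_model(state, actions):
    """Imperfect multi-step model: maps (state, action sequence) directly to
    the final state, so the assumed relative error is incurred only once."""
    return (1.0 + MODEL_BIAS) * true_rollout(state, actions)


def prediction_errors(horizon, start_state=1.0):
    """Return (composed one-step error, multi-step error) at a given horizon."""
    actions = [1.0] * horizon
    truth = true_rollout(start_state, actions)
    composed_error = abs(composed_rollout(start_state, actions) - truth)
    direct_error = abs(multi_step_model(start_state, actions) - truth)
    return composed_error, direct_error


if __name__ == "__main__":
    for horizon in (1, 2, 5, 10):
        composed_error, direct_error = prediction_errors(horizon)
        print(horizon, composed_error, direct_error)
```

At a horizon of one the two predictions coincide. At longer horizons the composed rollout re-applies the error at every step, while the direct prediction pays it once, which is the intuition behind the compounding-error problem under the stated assumption.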

Taxonomy

Topics: Software Reliability and Analysis Research · Risk and Safety Analysis · Machine Learning and Algorithms