Performance Guarantees for Data-Driven Sequential Decision-Making
Bowen Li, Edwin K. P. Chong, Ali Pezeshki

TL;DR
This paper introduces a general framework that bounds how close approximate dynamic programming (ADP) schemes come to the optimal objective value in sequential decision-making, with applications to data-driven robot path planning and multi-agent sensor coverage.
Contribution
The paper develops a general theoretical framework that gives performance guarantees for ADP schemes in the form of ratio bounds, and demonstrates the framework on two applications.
Findings
The objective value under an ADP scheme is at least a computable fraction of the optimal value.
The framework applies to data-driven robot path planning.
The framework applies to multi-agent sensor coverage.
Abstract
The solutions to many sequential decision-making problems are characterized by dynamic programming and Bellman's principle of optimality. However, due to the inherent complexity of solving Bellman's equation exactly, there has been significant interest in developing various approximate dynamic programming (ADP) schemes to obtain near-optimal solutions. A fundamental question that arises is: how close are the objective values produced by ADP schemes relative to the true optimal objective values? In this paper, we develop a general framework that provides performance guarantees for ADP schemes in the form of ratio bounds. Specifically, we show that the objective value under an ADP scheme is at least a computable fraction of the optimal value. We further demonstrate the applicability of our theoretical framework through two applications: data-driven robot path planning and multi-agent…
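To make the idea of a ratio bound concrete, here is a minimal sketch that measures the ratio between an approximate scheme's objective value and the true optimum on a toy sensor-placement problem. It is not the paper's framework: the coverage instance, the function names, and the use of a one-step greedy rule as a stand-in for an ADP scheme are all assumptions made for illustration. The paper's guarantee would be a lower bound on this kind of ratio that can be computed without brute-forcing the optimum, which is only feasible here because the instance is tiny.

```python
from itertools import combinations

# Hypothetical toy instance: each candidate sensor site covers a set of cells.
COVERAGE = {
    "x": {1, 2, 3, 4},
    "y": {1, 2, 5},
    "z": {3, 4, 6},
}


def covered(sites):
    """Objective value: number of distinct cells covered by the chosen sites."""
    cells = set()
    for site in sites:
        cells |= COVERAGE[site]
    return len(cells)


def greedy_plan(horizon):
    """Stand-in approximate scheme (an assumption, not the paper's method):
    at each stage, pick the unused site giving the largest total coverage."""
    chosen = []
    for _ in range(horizon):
        remaining = [s for s in sorted(COVERAGE) if s not in chosen]
        best = max(remaining, key=lambda s: covered(chosen + [s]))
        chosen.append(best)
    return chosen


def optimal_value(horizon):
    """Brute-force optimum over all site subsets of size `horizon`."""
    return max(covered(subset) for subset in combinations(COVERAGE, horizon))


def empirical_ratio(horizon):
    """Ratio of the approximate scheme's value to the optimal value."""
    return covered(greedy_plan(horizon)) / optimal_value(horizon)


if __name__ == "__main__":
    print("greedy value:", covered(greedy_plan(2)))
    print("optimal value:", optimal_value(2))
    print("ratio:", empirical_ratio(2))
```

On this instance the greedy rule commits to the largest single site first and ends up below the optimum, so the measured ratio is strictly less than 1. A ratio bound of the kind described in the abstract certifies how small such a ratio can get.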
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Distributed Control Multi-Agent Systems
