A Linearly Relaxed Approximate Linear Program for Markov Decision Processes
Chandrashekar Lakshminarayanan, Shalabh Bhatnagar, Csaba Szepesvari

TL;DR
This paper introduces LRALP, a relaxation of ALP for large MDPs that has a tractable number of constraints, and proves a new performance bound for it.
Contribution
The paper proposes LRALP, a linear relaxation of ALP whose constraints are positive linear combinations of the original ALP constraints, and establishes a novel performance bound for it.
Findings
LRALP has a tractable number of constraints.
A new performance bound for LRALP is established.
LRALP replaces the intractable constraint set of ALP with a tractable one, which makes the approach practical for MDPs with a large number of states.
Abstract
Approximate linear programming (ALP) and its variants have been widely applied to Markov Decision Processes (MDPs) with a large number of states. A serious limitation of ALP is that it has an intractable number of constraints, as a result of which constraint approximations are of interest. In this paper, we define a linearly relaxed approximate linear program (LRALP) that has a tractable number of constraints, obtained as positive linear combinations of the original constraints of the ALP. The main contribution is a novel performance bound for LRALP.
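To make the phrase "positive linear combinations of the original constraints" concrete, here is a minimal sketch on a toy MDP. It is not the authors' implementation. It assumes a reward-based convention in which the ALP constraints read (Φr)(s) ≥ g_a(s) + α Σ_s' p_a(s, s') (Φr)(s') for every state-action pair, which may differ in sign from the paper's own formulation. The MDP data, the feature matrix PHI, the aggregation matrix W, and all function names are hypothetical and chosen only for illustration.

```python
# Toy MDP with 3 states and 2 actions; every number here is made up.
ALPHA = 0.9
# P[a][s][t]: probability of moving from state s to state t under action a.
P = [
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
]
# G[a][s]: reward for taking action a in state s.
G = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
# PHI[s][j]: value of feature j at state s (feature 0 is constant).
PHI = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]


def alp_constraints(P, G, PHI, alpha):
    """Return (A, b) with the ALP constraints written as A r >= b.

    There is one row per (state, action) pair, ordered state-major.
    """
    n_states, n_feats = len(PHI), len(PHI[0])
    A, b = [], []
    for s in range(n_states):
        for a in range(len(P)):
            row = [
                PHI[s][j]
                - alpha * sum(P[a][s][t] * PHI[t][j] for t in range(n_states))
                for j in range(n_feats)
            ]
            A.append(row)
            b.append(G[a][s])
    return A, b


def relax(A, b, W):
    """Combine the rows of (A, b) using the nonnegative columns of W.

    Relaxed constraint i is sum_k W[k][i] * (original constraint k).
    """
    assert all(w >= 0 for row in W for w in row), "W must be nonnegative"
    n_rows, m, n_feats = len(W), len(W[0]), len(A[0])
    A_rel = [
        [sum(W[k][i] * A[k][j] for k in range(n_rows)) for j in range(n_feats)]
        for i in range(m)
    ]
    b_rel = [sum(W[k][i] * b[k] for k in range(n_rows)) for i in range(m)]
    return A_rel, b_rel


def is_feasible(A, b, r, tol=1e-9):
    """Check A r >= b row by row, up to a small tolerance."""
    return all(
        sum(a_ij * r_j for a_ij, r_j in zip(row, r)) >= b_i - tol
        for row, b_i in zip(A, b)
    )


A, b = alp_constraints(P, G, PHI, ALPHA)
# Hypothetical W: one relaxed constraint per action, summing over states.
W = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 1]]
A_rel, b_rel = relax(A, b, W)

r_alp = [2.0 / (1.0 - ALPHA), 0.0]  # constant value function, ALP-feasible
r_loose = [12.0, 0.0]               # satisfies only the relaxed constraints
print(len(A), "ALP constraints ->", len(A_rel), "relaxed constraints")
print(is_feasible(A, b, r_alp), is_feasible(A_rel, b_rel, r_alp))
print(is_feasible(A, b, r_loose), is_feasible(A_rel, b_rel, r_loose))
```

In this sketch the six state-action constraints collapse to two. Any weight vector that satisfies the original constraints also satisfies the aggregated ones, because each aggregated constraint is a nonnegative sum of original ones. The converse can fail, as `r_loose` shows, which is why a performance bound for the relaxed program is of interest.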
Taxonomy
Topics: Reinforcement Learning in Robotics · Scheduling and Optimization Algorithms · Advanced Control Systems Optimization
