Finite-Horizon Constrained MDPs With Both Additive And Multiplicative Utilities
Uday Kumar M, Sanjay P Bhat, Veeraruna Kavitha, and Nandyala Hemachandra

TL;DR
This paper addresses finite-horizon constrained MDPs with combined additive and multiplicative utilities by transforming the problem into an equivalent additive-only CMDP and solving a bilinear program to find optimal policies.
Contribution
It introduces a novel approach to handle mixed utility types in CMDPs by constructing an equivalent additive-only CMDP and formulating a bilinear program for optimal policy computation.
Findings
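An equivalent CMDP with only additive utilities, defined over a restricted set of policies, is constructed; its optimal value equals that of the original CMDP.
A finite-dimensional bilinear program (BLP) is formulated whose value equals the CMDP value, and an algorithm to solve it is suggested.
A solution of the BLP provides an optimal policy for the original CMDP.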
Abstract
This paper considers the problem of solving finite-horizon constrained Markov decision processes (CMDPs) in which the objective as well as the constraints are sums of additive and multiplicative utilities. To solve this problem, we construct another CMDP, with only additive utilities under a restricted set of policies, whose optimal value is equal to that of the original CMDP. Furthermore, we provide a finite-dimensional bilinear program (BLP) whose value equals the CMDP value and whose solution provides the optimal policy. We also suggest an algorithm to solve this BLP.
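To make the two utility types concrete, the following is a minimal sketch, not the paper's construction or its BLP. It assumes the additive utility is the expected sum of per-stage rewards and the multiplicative utility is the expected product of per-stage factors, and it evaluates both for a fixed Markov policy by backward recursion. The function name `evaluate` and the arrays `P`, `r`, `m`, `pi`, `mu0` are hypothetical names chosen for illustration.

```python
def evaluate(P, r, m, pi, mu0, T):
    """Return (additive, multiplicative) utilities of a Markov policy.

    Assumed (hypothetical) layout:
      P[s][a][s2]  transition probability from s to s2 under action a
      r[s][a]      per-stage additive reward
      m[s][a]      per-stage multiplicative factor
      pi[t][s][a]  probability of action a in state s at stage t
      mu0[s]       initial state distribution
      T            horizon (number of stages)
    """
    n_s, n_a = len(P), len(P[0])
    V = [0.0] * n_s  # additive value-to-go after the last stage (empty sum)
    W = [1.0] * n_s  # multiplicative value-to-go after the last stage (empty product)
    for t in reversed(range(T)):
        V_new, W_new = [], []
        for s in range(n_s):
            v = 0.0
            w = 0.0
            for a in range(n_a):
                ev = sum(P[s][a][s2] * V[s2] for s2 in range(n_s))
                ew = sum(P[s][a][s2] * W[s2] for s2 in range(n_s))
                v += pi[t][s][a] * (r[s][a] + ev)    # reward adds to the future value
                w += pi[t][s][a] * m[s][a] * ew      # factor multiplies the future value
            V_new.append(v)
            W_new.append(w)
        V, W = V_new, W_new
    additive = sum(mu0[s] * V[s] for s in range(n_s))
    multiplicative = sum(mu0[s] * W[s] for s in range(n_s))
    return additive, multiplicative


# Toy instance (illustrative numbers only): action 0 stays, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
r = [[1.0, 0.0], [0.0, 2.0]]
m = [[0.5, 1.0], [1.0, 0.9]]
uniform = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(3)]
add_u, mult_u = evaluate(P, r, m, uniform, [1.0, 0.0], 3)
print("additive:", add_u, "multiplicative:", mult_u, "sum:", add_u + mult_u)
```

Under these assumptions, the objective and each constraint in the abstract would be a quantity of the form `add_u + mult_u`, each with its own reward and factor arrays. The recursion above only evaluates a given policy; finding an optimal policy under the constraints is what the equivalent additive CMDP and the BLP address.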
Taxonomy
Topics: Optimization and Search Problems · Advanced Control Systems Optimization
