On the correctness of monadic backward induction
Nuria Brede, Nicola Botta

TL;DR
This paper formalizes the correctness of monadic backward induction in finite-horizon SDPs, broadening its applicability beyond traditional deterministic and stochastic cases, using category-theoretical conditions.
Contribution
It defines a correctness notion for monadic SDPs and proves conditions under which monadic backward induction is valid, extending classical results to a more general framework.
Findings
Backward induction is correct under certain general conditions.
The framework applies to deterministic, stochastic, and some other SDPs.
Some instances considered admissible are ruled out by the conditions.
Abstract
In control theory, to solve a finite-horizon sequential decision problem (SDP) commonly means to find a list of decision rules that result in an optimal expected total reward (or cost) when taking a given number of decision steps. SDPs are routinely solved using Bellman's backward induction. Textbook authors (e.g. Bertsekas or Puterman) typically give more or less formal proofs to show that the backward induction algorithm is correct as solution method for deterministic and stochastic SDPs. Botta, Jansson and Ionescu propose a generic framework for finite horizon, monadic SDPs together with a monadic version of backward induction for solving such SDPs. In monadic SDPs, the monad captures a generic notion of uncertainty, while a generic measure function aggregates rewards. In the present paper we define a notion of correctness for monadic SDPs and identify three conditions that allow…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
