A unified worst case for classical simplex and policy iteration pivot rules
Yann Disser, Nils Mosis

TL;DR
This paper constructs a family of Markov decision processes and linear programs demonstrating that several classical pivot rules for the simplex and policy iteration algorithms have exponential worst-case complexity, unifying and extending known lower bounds.
Contribution
It introduces a single construction that yields exponential lower bounds for Dantzig's, Bland's, and the Largest Increase pivot rules, both for policy iteration on Markov decision processes and for the simplex algorithm on the corresponding linear programs.
Findings
Exponential lower bounds for Dantzig's, Bland's, and Largest Increase pivot rules.
Unified construction reproduces known lower bounds for these rules.
No combination of these pivot rules, deterministic or randomized, can avoid exponential worst-case behavior.
Abstract
We construct a family of Markov decision processes for which the policy iteration algorithm needs an exponential number of improving switches with Dantzig's rule, with Bland's rule, and with the Largest Increase pivot rule. This immediately translates to a family of linear programs for which the simplex algorithm needs an exponential number of pivot steps with the same three pivot rules. Our results yield a unified construction that simultaneously reproduces well-known lower bounds for these classical pivot rules, and we are able to infer that any (deterministic or randomized) combination of them cannot avoid exponential worst-case behavior. Regarding the policy iteration algorithm, pivot rules typically switch multiple edges simultaneously, and our lower bounds for Dantzig's rule and the Largest Increase rule, which perform only single switches, seem novel. Regarding the simplex…
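To make the three pivot rules concrete, the following is a minimal sketch of single-switch policy iteration on a tiny hand-made MDP. It is not the construction from the paper: the toy MDP, the discount factor GAMMA, the fixed number of evaluation sweeps, and all function names are assumptions chosen for illustration (the discounted criterion is used here only to keep the example short). Each rule receives the list of improving switches and picks exactly one of them.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[float, Dict[int, float]]   # (reward, {next_state: probability})
MDP = List[List[Action]]                  # mdp[s] lists the actions available in state s
Switch = Tuple[int, int, float]           # (state, action, reduced cost)

GAMMA = 0.9  # assumed discount factor, not taken from the paper


def evaluate(mdp: MDP, policy: List[int], sweeps: int = 2000) -> List[float]:
    """Approximate the values of a fixed policy by repeated Bellman sweeps."""
    v = [0.0] * len(mdp)
    for _ in range(sweeps):
        v = [
            mdp[s][policy[s]][0]
            + GAMMA * sum(p * v[t] for t, p in mdp[s][policy[s]][1].items())
            for s in range(len(mdp))
        ]
    return v


def improving_switches(mdp: MDP, policy: List[int], v: List[float],
                       eps: float = 1e-9) -> List[Switch]:
    """All (state, action) pairs whose reduced cost is strictly positive."""
    out = []
    for s, actions in enumerate(mdp):
        for a, (reward, nxt) in enumerate(actions):
            gain = reward + GAMMA * sum(p * v[t] for t, p in nxt.items()) - v[s]
            if a != policy[s] and gain > eps:
                out.append((s, a, gain))
    return out


def bland(mdp: MDP, policy: List[int], cands: List[Switch]) -> Switch:
    # Bland's rule: smallest index in a fixed ordering of the switches.
    return min(cands, key=lambda c: (c[0], c[1]))


def dantzig(mdp: MDP, policy: List[int], cands: List[Switch]) -> Switch:
    # Dantzig's rule: largest reduced cost.
    return max(cands, key=lambda c: c[2])


def largest_increase(mdp: MDP, policy: List[int], cands: List[Switch]) -> Switch:
    # Largest Increase: the switch after which the sum of all values is largest.
    def total_after(c: Switch) -> float:
        trial = policy[:]
        trial[c[0]] = c[1]
        return sum(evaluate(mdp, trial))
    return max(cands, key=total_after)


Rule = Callable[[MDP, List[int], List[Switch]], Switch]


def policy_iteration(mdp: MDP, policy: List[int], rule: Rule) -> Tuple[List[int], int]:
    """Apply one improving switch per iteration until none remains."""
    policy, steps = policy[:], 0
    while True:
        cands = improving_switches(mdp, policy, evaluate(mdp, policy))
        if not cands:
            return policy, steps
        s, a, _ = rule(mdp, policy, cands)
        policy[s] = a
        steps += 1


# Hypothetical toy MDP: states 0 and 1 can stay put or move on; state 2 is a sink.
toy: MDP = [
    [(0.0, {0: 1.0}), (1.0, {1: 1.0})],
    [(0.0, {1: 1.0}), (2.0, {2: 1.0})],
    [(0.0, {2: 1.0})],
]

for rule in (bland, dantzig, largest_increase):
    print(rule.__name__, policy_iteration(toy, [0, 0, 0], rule))
```

On this toy instance all three rules reach the same optimal policy after a couple of switches. The lower-bound constructions described in the abstract are families of instances on which the number of switches grows exponentially, which a three-state example cannot show.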
