Deflated Dynamics Value Iteration
Jongmin Lee, Amin Rakhsha, Ernest K. Ryu, Amir-massoud Farahmand

TL;DR
This paper introduces Deflated Dynamics Value Iteration (DDVI), a method that accelerates value iteration by deflating (removing) the dominant eigen-structure of the transition matrix, and extends the idea to reinforcement learning with the Deflated Dynamics Temporal Difference (DDTD) algorithm.
Contribution
The paper proposes DDVI, which uses matrix splitting and matrix deflation to improve the convergence rate of value iteration, and extends it to RL with DDTD, demonstrating empirical effectiveness.
Findings
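DDVI converges faster than standard VI, whose error decays as $O(\gamma^k)$ in the iteration count $k$ and is therefore slow when the discount factor $\gamma$ is close to $1$.
Theoretical analysis shows that, once the top $s$ dominant eigen-structure is deflated, the convergence rate depends on the $(s+1)$-th largest eigenvalue of the transition matrix.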
Empirical results confirm improved performance in RL tasks.
Abstract
The Value Iteration (VI) algorithm is an iterative procedure to compute the value function of a Markov decision process, and is the basis of many reinforcement learning (RL) algorithms as well. As the error convergence rate of VI as a function of iteration $k$ is $O(\gamma^k)$, it is slow when the discount factor $\gamma$ is close to $1$. To accelerate the computation of the value function, we propose Deflated Dynamics Value Iteration (DDVI). DDVI uses matrix splitting and matrix deflation techniques to effectively remove (deflate) the top $s$ dominant eigen-structure of the transition matrix $\mathcal{P}^{\pi}$. We prove that this leads to a $\tilde{O}(\gamma^k |\lambda_{s+1}|^k)$ convergence rate, where $\lambda_{s+1}$ is the $(s+1)$-th largest eigenvalue of the dynamics matrix. We then extend DDVI to the RL setting and present the Deflated Dynamics Temporal Difference (DDTD) algorithm. We…
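To make the splitting-plus-deflation idea concrete, here is a minimal sketch for policy evaluation with a rank-one deflation. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of the uniform vector `v`, and the deflation matrix `E = 1 v^T` are all assumptions made for this example. The sketch rewrites the Bellman equation `V = r + gamma * P V` as `(I - gamma * E) V = r + gamma * (P - E) V` and iterates on that form. Because `E @ E == E` here, the inverse `(I - gamma * E)^{-1}` has the closed form `I + gamma / (1 - gamma) * E`.

```python
import numpy as np


def standard_vi(P, r, gamma, num_iters):
    """Plain value iteration for policy evaluation: V <- r + gamma * P V."""
    V = np.zeros_like(r)
    for _ in range(num_iters):
        V = r + gamma * (P @ V)
    return V


def rank_one_deflated_vi(P, r, gamma, num_iters):
    """Rank-one deflated iteration (illustrative sketch, not the paper's code).

    Assumption: E = 1 v^T with v uniform, so E removes the eigenvalue 1 of the
    row-stochastic matrix P (whose right eigenvector is the all-ones vector).
    """
    n = P.shape[0]
    ones = np.ones(n)
    v = np.full(n, 1.0 / n)          # any v with v @ ones == 1 would do here
    E = np.outer(ones, v)            # rank-one deflation matrix, E @ E == E
    # Closed-form inverse of (I - gamma * E), valid because E is idempotent.
    inv = np.eye(n) + (gamma / (1.0 - gamma)) * E
    V = np.zeros_like(r)
    for _ in range(num_iters):
        # Same fixed point as standard VI, but iterating with P - E.
        V = inv @ (r + gamma * ((P - E) @ V))
    return V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, gamma, iters = 10, 0.99, 200
    P = rng.random((n, n))
    P /= P.sum(axis=1, keepdims=True)    # make each row a probability vector
    r = rng.random(n)
    V_star = np.linalg.solve(np.eye(n) - gamma * P, r)   # exact solution
    for name, fn in [("standard VI", standard_vi),
                     ("deflated VI", rank_one_deflated_vi)]:
        err = np.max(np.abs(fn(P, r, gamma, iters) - V_star))
        print(f"{name}: max error after {iters} iterations = {err:.3e}")
```

Both iterations share the fixed point of the Bellman equation. The deflated one contracts at a rate governed by the second-largest eigenvalue modulus of `P` rather than by `gamma` alone, which is why it is expected to reach a far smaller error in the same number of iterations when `gamma` is close to 1.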
Taxonomy
Topics: Hydraulic and Pneumatic Systems · Dynamics and Control of Mechanical Systems
