Unrolling Dynamic Programming via Graph Filters
Sergio Rozada, Samuel Rey, Gonzalo Mateos, Antonio G. Marques

TL;DR
This paper introduces BellNet, a graph filter-based neural network that unrolls and truncates dynamic programming iterations, enabling efficient approximation of optimal policies in Markov decision processes.
Contribution
The paper presents an unrolling approach that unifies policy iteration and value iteration through graph filters, with the aim of reducing the computational cost of classical dynamic programming.
Findings
BellNet effectively approximates optimal policies with fewer iterations.
The method leverages graph signal processing for compact policy representation.
Preliminary results show significant speedup over classical dynamic programming.
Abstract
Dynamic programming (DP) is a fundamental tool used across many engineering fields. The main goal of DP is to solve Bellman's optimality equations for a given Markov decision process (MDP). Standard methods like policy iteration exploit the fixed-point nature of these equations to solve them iteratively. However, these algorithms can be computationally expensive when the state-action space is large or when the problem involves long-term dependencies. Here we propose a new approach that unrolls and truncates policy iterations into a learnable parametric model dubbed BellNet, which we train to minimize the so-termed Bellman error from random value function initializations. Viewing the transition probability matrix of the MDP as the adjacency of a weighted directed graph, we draw insights from graph signal processing to interpret (and compactly re-parameterize) BellNet as a cascade of…
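As a rough illustration of the graph-filter view mentioned in the abstract, the sketch below evaluates a fixed policy by truncating the Neumann series v = sum_k gamma^k P^k r, which is a polynomial in the transition matrix P and therefore a graph filter with taps gamma^k. It is not the authors' BellNet: the taps are fixed rather than learned, and the 3-state chain, the reward vector, and the helper names (`matvec`, `graph_filter`, `bellman_error`) are assumptions made up for this example.

```python
# Illustrative sketch only (not the paper's BellNet): truncated policy
# evaluation written as a polynomial graph filter in the transition matrix P.

def matvec(P, v):
    """Multiply a row-stochastic matrix P (list of rows) by a vector v."""
    return [sum(p_ij * v_j for p_ij, v_j in zip(row, v)) for row in P]

def graph_filter(P, r, coeffs):
    """Return sum_k coeffs[k] * P^k r, a polynomial graph filter applied to r."""
    out = [0.0] * len(r)
    x = list(r)  # x holds P^k r, starting at k = 0
    for h in coeffs:
        out = [o + h * xi for o, xi in zip(out, x)]
        x = matvec(P, x)
    return out

def bellman_error(P, r, v, gamma):
    """Squared residual of the fixed-policy Bellman equation v = r + gamma * P v."""
    Pv = matvec(P, v)
    return sum((vi - (ri + gamma * pvi)) ** 2 for vi, ri, pvi in zip(v, r, Pv))

# Hypothetical 3-state chain under a fixed policy; state 2 is absorbing
# and is the only state that pays a reward.
P = [[0.9, 0.1, 0.0],
     [0.0, 0.9, 0.1],
     [0.0, 0.0, 1.0]]
r = [0.0, 0.0, 1.0]
gamma = 0.9

# Bellman error of the truncated filter for several filter orders K.
errors = {}
for K in (1, 5, 20, 100):
    coeffs = [gamma ** k for k in range(K + 1)]  # fixed taps, not learned
    v_K = graph_filter(P, r, coeffs)
    errors[K] = bellman_error(P, r, v_K, gamma)
```

In this toy setting the Bellman error shrinks as the filter order K grows, because the residual of the truncated series is gamma^(K+1) P^(K+1) r. A learned model in the spirit of the abstract would instead train the filter taps to reduce that error with a small, fixed number of layers.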
