Analysis and Optimisation of Bellman Residual Errors with Neural Function Approximation

Martin Gottwald (1), Sven Gronauer (1), Hao Shen (2), Klaus Diepold (1) ((1) Technical University of Munich, (2) fortiss)

arXiv:2106.08774 · cs.LG · March 15, 2022

TL;DR

This paper analyzes the Bellman residual error in neural network-based value function approximation within Deep Reinforcement Learning, proposing an efficient Newton-based optimization algorithm with theoretical and empirical validation.

Contribution

The paper introduces an Approximate Newton's algorithm for minimising the Mean Squared Bellman Error (MSBE), together with a critical point analysis of the error function and two variations of the algorithm suitable for discrete and continuous settings.

Findings

1. The Gauss-Newton Residual Gradient algorithm is locally quadratically convergent.

2. Over-parameterized neural networks can avoid suboptimal local minima under certain conditions.

3. Empirical results demonstrate the algorithm's effectiveness in continuous control problems.

Abstract

Recent development of Deep Reinforcement Learning (DRL) has demonstrated superior performance of neural networks in solving challenging problems with large or even continuous state spaces. One specific approach is to deploy neural networks to approximate value functions by minimising the Mean Squared Bellman Error (MSBE) function. Despite great successes of DRL, development of reliable and efficient numerical algorithms to minimise the MSBE is still of great scientific interest and practical demand. Such a challenge is partially due to the underlying optimisation problem being highly non-convex or using incomplete gradient information as done in Semi-Gradient algorithms. In this work, we analyse the MSBE from a smooth optimisation perspective and develop an efficient Approximate Newton's algorithm. First, we conduct a critical point analysis of the error function and provide technical…
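As a rough illustration of what minimising the MSBE with full gradient information can look like, the sketch below runs a damped Gauss-Newton iteration on the Bellman residual of a small tanh network for policy evaluation. It is not the authors' implementation: the five-state random-walk chain, the network size, the damping schedule and all function names (`value_and_jacobian`, `bellman_residual`, `msbe`, `damped_gauss_newton`) are assumptions made for this example. Unlike a Semi-Gradient method, the Jacobian here also differentiates through the value of the successor state.

```python
import numpy as np

def value_and_jacobian(theta, X, h):
    """One-hidden-layer tanh network V(x) and its Jacobian dV/dtheta."""
    n, d = X.shape
    W1 = theta[:h * d].reshape(h, d)
    b1 = theta[h * d:h * d + h]
    w2 = theta[h * d + h:h * d + 2 * h]
    b2 = theta[-1]
    A = np.tanh(X @ W1.T + b1)             # hidden activations, shape (n, h)
    V = A @ w2 + b2                        # state values, shape (n,)
    D = (1.0 - A ** 2) * w2                # dV/d(pre-activation), shape (n, h)
    J_W1 = (D[:, :, None] * X[:, None, :]).reshape(n, h * d)
    J = np.hstack([J_W1, D, A, np.ones((n, 1))])  # column order: W1, b1, w2, b2
    return V, J

def bellman_residual(theta, X, P, R, gamma, h):
    """Residual (I - gamma P) V - R and its full Jacobian (not a semi-gradient)."""
    V, JV = value_and_jacobian(theta, X, h)
    B = np.eye(len(R)) - gamma * P
    return B @ V - R, B @ JV

def msbe(theta, X, P, R, gamma, h):
    res, _ = bellman_residual(theta, X, P, R, gamma, h)
    return 0.5 * np.mean(res ** 2)

def damped_gauss_newton(theta, X, P, R, gamma, h, iters=50, damping=1e-3):
    """Gauss-Newton steps with a simple accept/reject damping rule (assumed)."""
    for _ in range(iters):
        res, J = bellman_residual(theta, X, P, R, gamma, h)
        H = J.T @ J + damping * np.eye(theta.size)   # Gauss-Newton approximation
        candidate = theta - np.linalg.solve(H, J.T @ res)
        if msbe(candidate, X, P, R, gamma, h) < msbe(theta, X, P, R, gamma, h):
            theta, damping = candidate, max(damping / 10.0, 1e-12)
        else:
            damping *= 10.0                # reject the step and damp more strongly
    return theta

# Hypothetical toy problem: random walk on 5 states, reward in the last state.
rng = np.random.default_rng(0)
n, h, gamma = 5, 8, 0.9
X = np.linspace(-1.0, 1.0, n)[:, None]     # one scalar feature per state
P = np.zeros((n, n))
for s in range(n):
    P[s, max(s - 1, 0)] += 0.5
    P[s, min(s + 1, n - 1)] += 0.5
R = np.zeros(n)
R[-1] = 1.0

theta0 = 0.5 * rng.standard_normal(h * X.shape[1] + 2 * h + 1)
theta = damped_gauss_newton(theta0, X, P, R, gamma, h)
print("MSBE before:", msbe(theta0, X, P, R, gamma, h))
print("MSBE after: ", msbe(theta, X, P, R, gamma, h))
```

The network has 25 parameters for 5 states, so it is over-parameterized in the sense of the findings above. The accept/reject rule is only a safeguard chosen for this sketch so that the error never increases; it does not reproduce the convergence analysis of the paper.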

Taxonomy

Topics: Model Reduction and Neural Networks · Reinforcement Learning in Robotics · Neural Networks and Applications