Mean-Field Generalisation Bounds for Learning Controls in Stochastic Environments
Boris Baros, Samuel N. Cohen, Christoph Reisinger

TL;DR
This paper derives non-asymptotic mean-field generalisation bounds for neural-network control policies learned from finitely many observations in discrete-time stochastic environments, ensuring stability in overparameterised settings.
Contribution
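It formulates the data-driven discrete-time stochastic control problem as a series of infinite-dimensional minimisation problems, using the dynamic programming principle and the mean-field interpretation of single-hidden-layer neural networks, and derives non-asymptotic generalisation bounds for the regularised minimisers.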
Findings
Non-asymptotic bounds on generalisation error for learned controls
Connections between the regularised control-learning problem and the noisy stochastic gradient descent algorithm
Promising numerical results for some classic control problems
Abstract
We consider a data-driven formulation of the classical discrete-time stochastic control problem. Our approach exploits the natural structure of many such problems, in which significant portions of the system are uncontrolled. Employing the dynamic programming principle and the mean-field interpretation of single-hidden-layer neural networks, we formulate the control problem as a series of infinite-dimensional minimisation problems. When these problems are carefully regularised, we provide practically verifiable assumptions for non-asymptotic bounds on the generalisation error achieved by their minimisers, thus ensuring stability in overparameterised settings, for controls learned using finitely many observations. We explore connections to the traditional noisy stochastic gradient descent algorithm, and subsequently show promising numerical results for some classic control problems.
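To make the mean-field interpretation and the link to noisy stochastic gradient descent concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tanh activation, a squared loss, an L2 penalty, and a temperature parameter `temp` for the injected Gaussian noise; the function names `mean_field_network` and `noisy_sgd_step` are hypothetical, and the paper's actual loss, regulariser, and algorithm may differ.

```python
import numpy as np


def mean_field_network(x, a, w, b):
    """Average of N neurons a_i * tanh(w_i . x + b_i); x has shape (n, d)."""
    hidden = np.tanh(x @ w.T + b)       # (n, N) hidden activations
    return hidden @ a / a.shape[0]      # (n,) mean over neurons


def noisy_sgd_step(params, x, y, lr, reg, temp, rng):
    """One noisy gradient step on squared loss plus an L2 penalty (assumed).

    Gradients are taken per neuron (rescaled by N, a common mean-field
    convention), then Gaussian noise of variance 2 * lr * temp is added.
    """
    a, w, b = params
    n = x.shape[0]
    hidden = np.tanh(x @ w.T + b)                     # (n, N)
    resid = hidden @ a / a.shape[0] - y               # (n,) prediction error
    g_pre = resid[:, None] * a * (1.0 - hidden**2)    # (n, N) pre-activation grads
    grads = (
        hidden.T @ resid / n + reg * a,               # (N,)   gradient for a
        g_pre.T @ x / n + reg * w,                    # (N, d) gradient for w
        g_pre.mean(axis=0) + reg * b,                 # (N,)   gradient for b
    )
    scale = np.sqrt(2.0 * lr * temp)                  # noise standard deviation
    return tuple(
        p - lr * g + scale * rng.standard_normal(p.shape)
        for p, g in zip(params, grads)
    )
```

Averaging over neurons, rather than summing, is what lets the network be read as an integral against the empirical distribution of neuron parameters. Setting `temp=0` recovers plain gradient descent, while `temp>0` gives a Langevin-type update.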
