Training Recurrent Neural Networks by Sequential Least Squares and the Alternating Direction Method of Multipliers

Alberto Bemporad

arXiv:2112.15348·cs.LG·October 18, 2022

Open Access

TL;DR

This paper introduces NAILS and NAILM, algorithms that combine sequential least squares with ADMM to train recurrent neural networks, handling smooth convex losses and regularizers, non-smooth regularizers, and possibly non-convex constraints.

Contribution

It presents a training algorithm for RNNs that pairs sequential least squares, safeguarded by a line search or a Levenberg-Marquardt trust-region step, with ADMM to handle non-smooth regularization terms and possibly non-convex constraints.

Findings

01 Successfully applied to nonlinear system identification tasks

02 Handles non-smooth and non-convex regularizations effectively

03 Demonstrates convergence with various loss functions and constraints

Abstract

This paper proposes a novel algorithm for training recurrent neural network models of nonlinear dynamical systems from an input/output training dataset. Arbitrary convex and twice-differentiable loss functions and regularization terms are handled by sequential least squares, combined with either a line-search (LS) method or a trust-region method of Levenberg-Marquardt (LM) type to ensure convergence. In addition, to handle non-smooth regularization terms such as $\ell_{1}$, $\ell_{0}$, and group-Lasso regularizers, as well as to impose possibly non-convex constraints such as integer and mixed-integer constraints, we combine sequential least squares with the alternating direction method of multipliers (ADMM). We call the resulting algorithm NAILS (nonconvex ADMM iterations and least squares) when line search (LS) is used, or NAILM when a trust-region method (LM) is employed instead. The training…
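To make the pairing of least squares and ADMM concrete, here is a minimal sketch of the generic ADMM splitting for an $\ell_{1}$-regularized least-squares problem. It is not the paper's NAILS or NAILM algorithm: a linear model stands in for the recurrent network, one exact least-squares solve replaces the sequential linearizations with line search or LM steps, and the names `soft_threshold`, `admm_l1_least_squares`, `lam`, and `rho` are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def admm_l1_least_squares(A, b, lam, rho=1.0, iters=200):
    """Approximately minimize 0.5*||A x - b||^2 + lam*||z||_1 s.t. x = z.

    Illustrative stand-in only: the smooth term is a linear least-squares
    fit, not an RNN training loss.
    """
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable for the constraint x = z

    # The x-update is a regularized least-squares problem whose
    # normal-equations matrix does not change across iterations.
    H = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b

    for _ in range(iters):
        # Smooth step: least squares plus a quadratic pull toward z - u.
        x = np.linalg.solve(H, Atb + rho * (z - u))
        # Non-smooth step: the l1 term is handled by its proximal operator.
        z = soft_threshold(x + u, lam / rho)
        # Dual step: accumulate the constraint violation x - z.
        u = u + x - z
    return z


# Synthetic example (assumed data): a sparse parameter vector with two
# nonzero entries, observed through a random linear map with small noise.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[[1, 4]] = [2.0, -3.0]
b = A @ x_true + 0.01 * rng.standard_normal(50)

z = admm_l1_least_squares(A, b, lam=1.0)
```

The split keeps the smooth fitting term and the non-smooth regularizer in separate subproblems, so swapping `soft_threshold` for another proximal or projection step (for example, rounding to integers) changes only the `z` update. With a non-convex step of that kind, the usual convex ADMM convergence guarantees no longer apply.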

Taxonomy

Topics: Control Systems and Identification · Neural Networks and Applications · Blind Source Separation Techniques

Methods: Alternating Direction Method of Multipliers