Training Recurrent Neural Networks by Sequential Least Squares and the Alternating Direction Method of Multipliers
Alberto Bemporad

TL;DR
This paper introduces NAILS and NAILM, novel algorithms combining sequential least squares with ADMM for training recurrent neural networks, effectively handling convex, non-convex, and non-smooth regularization terms.
Contribution
It presents a new training algorithm for RNNs that integrates least squares and ADMM to handle complex regularization and constraints, improving convergence and flexibility.
Findings
Successfully applied to nonlinear system identification tasks
Handles non-smooth and non-convex regularizations effectively
Demonstrates convergence with various loss functions and constraints
Abstract
This paper proposes a novel algorithm for training recurrent neural network models of nonlinear dynamical systems from an input/output training dataset. Arbitrary convex and twice-differentiable loss functions and regularization terms are handled by sequential least squares and either a line-search (LS) or a trust-region method of Levenberg-Marquardt (LM) type for ensuring convergence. In addition, to handle non-smooth regularization terms such as $\ell_1$, $\ell_0$, and group-Lasso regularizers, as well as to impose possibly non-convex constraints such as integer and mixed-integer constraints, we combine sequential least squares with the alternating direction method of multipliers (ADMM). We call the resulting algorithm NAILS (nonconvex ADMM iterations and least squares) in the case line search (LS) is used, or NAILM if a trust-region method (LM) is employed instead. The training…
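To illustrate how ADMM lets a least-squares solver coexist with a non-smooth regularizer, here is a minimal sketch on a toy problem. It is not the NAILS/NAILM algorithm: it assumes a linear model, so a single regularized least-squares solve stands in for the sequence of linearized least-squares steps described in the abstract, and it covers only the $\ell_1$ case. The names `admm_lasso`, `soft_threshold`, `lam`, and `rho` are illustrative choices, not taken from the paper.

```python
import numpy as np

def soft_threshold(v, kappa):
    # Proximal operator of kappa * ||.||_1 (elementwise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=10.0, iters=500):
    """Minimize 0.5*||A x - b||^2 + lam*||z||_1 subject to x = z (scaled ADMM).

    Assumption: linear model, so the x-update is one regularized
    least-squares solve rather than a linearized (Gauss-Newton-like) step.
    """
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    lhs = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        # x-update: smooth part only, solved as a least-squares problem.
        x = np.linalg.solve(lhs, Atb + rho * (z - u))
        # z-update: non-smooth part only, handled by its proximal operator.
        z = soft_threshold(x + u, lam / rho)
        # Dual update: accumulate the consensus violation x - z.
        u = u + x - z
    return z

# Small synthetic demo (all data below is made up for illustration).
rng = np.random.default_rng(0)
A_demo = rng.standard_normal((200, 5))
x_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])
b_demo = A_demo @ x_true + 0.01 * rng.standard_normal(200)
z_hat = admm_lasso(A_demo, b_demo, lam=1.0)
print(np.round(z_hat, 3))
```

The split is the point of the design: the smooth loss never sees the regularizer, and the regularizer enters only through its proximal operator. Replacing `soft_threshold` with a hard-thresholding rule would give a non-convex, $\ell_0$-style heuristic variant of the same loop, though without the convergence guarantees of the convex case.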
Taxonomy
Topics: Control Systems and Identification · Neural Networks and Applications · Blind Source Separation Techniques
Methods: Alternating Direction Method of Multipliers
