Convergence in On-line Learning of Static and Dynamic Systems
Torbjörn Wigren, Ruoqi Zhang, Per Mattsson

TL;DR
This paper analytically characterizes the asymptotic behavior of the ADAM algorithm in online nonlinear system identification, revealing its convergence properties and equivalence to other stochastic gradient methods under certain conditions.
Contribution
It provides the first analytical expressions for ADAM's asymptotic update direction in recursive nonlinear system identification, including special cases and convergence guarantees.
Findings
ADAM's asymptotic update direction matches that of a diagonally normalized stochastic gradient.
With internal filtering off, ADAM behaves like a sign-sign stochastic gradient algorithm.
Monte-Carlo simulations validate the theoretical convergence results.
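
The sign-sign finding can be checked numerically with a minimal sketch. It assumes that "internal filtering off" corresponds to setting both Adam decay rates to zero (beta1 = beta2 = 0), and it uses a hypothetical linear regression model with a squared prediction-error criterion rather than the nonlinear models treated in the paper. The function names `adam_step` and `sign_sign_step` are illustrative only.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update in the commonly used form (bias-corrected moments)."""
    m = beta1 * m + (1.0 - beta1) * grad          # filtered gradient
    v = beta2 * v + (1.0 - beta2) * grad**2       # filtered gradient power
    m_hat = m / (1.0 - beta1**t)
    v_hat = v / (1.0 - beta2**t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

def sign_sign_step(theta, phi, err, alpha=1e-3):
    """Sign-sign stochastic gradient step: sign(error) times sign(regressor)."""
    return theta + alpha * np.sign(err) * np.sign(phi)

# Hypothetical data: y = phi^T theta_true, current estimate theta = 0.
rng = np.random.default_rng(0)
theta = np.zeros(3)
phi = rng.normal(size=3)
y = phi @ np.array([1.0, -2.0, 0.5])
err = y - phi @ theta          # prediction error
grad = -err * phi              # gradient of 0.5 * err**2 with respect to theta

# Assumption: filtering off means beta1 = beta2 = 0.
theta_adam, _, _ = adam_step(theta, grad, np.zeros(3), np.zeros(3), t=1,
                             beta1=0.0, beta2=0.0)
theta_ss = sign_sign_step(theta, phi, err)
print(np.allclose(theta_adam, theta_ss, atol=1e-6))
```

With both decay rates at zero, each component of the step becomes grad / (|grad| + eps), which is the sign of the gradient up to the small constant eps. For this scalar-output model the gradient is the product of the error and the regressor, so its sign factors into sign(error) times sign(regressor).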
Abstract
The paper derives analytical expressions for the asymptotic average updating direction of the adaptive moment estimation (ADAM) algorithm when applied to recursive identification of nonlinear systems. It is proved that the standard hyper-parameter setting results in the same asymptotic average updating direction as a diagonally power normalized stochastic gradient algorithm. With the internal filtering turned off, the asymptotic average updating direction is instead equivalent to that of a sign-sign stochastic gradient algorithm. Global convergence to an invariant set follows, a subset of which consists of the parameters that give a correct input-output description of the system. The paper also exploits a nonlinear dynamic model to embed structure in recurrent neural networks. A Monte-Carlo simulation study validates the results.
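
One way to read "diagonally power normalized" is to rewrite the usual ADAM step in matrix form. The notation below is the common textbook one and is an assumption, not the paper's own: $\hat m_t$ and $\hat v_t$ are the bias-corrected filtered gradient and filtered squared gradient, $\alpha$ is the step size, and $\epsilon$ is a small constant.

```latex
\theta_t = \theta_{t-1} - \alpha \left( P_t^{1/2} + \epsilon I \right)^{-1} \hat m_t,
\qquad
P_t = \operatorname{diag}\!\left( \hat v_{t,1}, \ldots, \hat v_{t,n} \right).
```

Since $P_t$ is diagonal, this is identical to the element-wise division in the standard ADAM update: each gradient component is scaled by the inverse square root of its own estimated average power.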
Taxonomy
Topics: Experimental Learning in Engineering · Control Systems and Identification · Advanced Control Systems Optimization
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sparse Evolutionary Training
