Learning via nonlinear conjugate gradients and depth-varying neural ODEs
George Baravdish, Gabriel Eilertsen, Rym Jaroudi, B. Tomas Johansson, Lukáš Malý, Jonas Unger

TL;DR
This paper introduces a method for reconstructing depth-varying parameters in neural ODEs using nonlinear conjugate gradients, and reports improved stability and smoothness for continuous-depth models, i.e. residual networks in the limit of infinitely many layers.
Contribution
It develops a new parameter reconstruction approach for neural ODEs based on nonlinear conjugate gradients, including mathematical analysis and a Sobolev gradient for stability.
Findings
Method performs well on synthetic datasets
Outperforms standard gradient approaches
Ensures stability and smoothness in deep networks
Abstract
The inverse problem of supervised reconstruction of depth-variable (time-dependent) parameters in a neural ordinary differential equation (NODE) is considered; that is, finding the weights of a residual network with time-continuous layers. The NODE is treated as an isolated entity describing the full network, as opposed to earlier research, which embedded it between pre- and post-appended layers trained by conventional methods. The proposed parameter reconstruction is done for a general first-order differential equation by minimizing a cost functional covering a variety of loss functions and penalty terms. A nonlinear conjugate gradient (NCG) method is derived for the minimization. Mathematical properties are stated for the differential equation and the cost functional. The adjoint problem needed is derived together with a sensitivity problem. The sensitivity problem can estimate…
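To make the setting in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a scalar toy NODE x'(t) = tanh(θ(t) x(t)), a forward Euler discretization (one residual layer per step), a plain least-squares loss with no penalty term, a discrete adjoint (backpropagation) for the gradient, the Polak-Ribière+ conjugate coefficient, and Armijo backtracking for the step size. The abstract mentions a sensitivity problem; the backtracking line search here is a stand-in chosen for brevity. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def forward(theta, x0, h):
    """Forward Euler for x' = tanh(theta(t) * x): one residual layer per step."""
    xs = [x0]
    for th in theta:
        xs.append(xs[-1] + h * np.tanh(th * xs[-1]))
    return xs  # states at every depth; xs[-1] is the network output

def loss_and_grad(theta, x0, y, h):
    """Least-squares loss and its gradient via the discrete adjoint."""
    xs = forward(theta, x0, h)
    residual = xs[-1] - y
    loss = 0.5 * np.sum(residual ** 2)
    lam = residual                     # adjoint state at the final depth
    grad = np.zeros_like(theta)
    for k in reversed(range(len(theta))):
        s = 1.0 - np.tanh(theta[k] * xs[k]) ** 2    # tanh'(theta_k * x_k)
        grad[k] = np.sum(lam * h * s * xs[k])       # dL/dtheta_k
        lam = lam * (1.0 + h * s * theta[k])        # step the adjoint backwards
    return loss, grad

def ncg(theta, x0, y, h, iters=50, tol=1e-10):
    """Nonlinear conjugate gradients (assumed variant: Polak-Ribiere+)."""
    loss, g = loss_and_grad(theta, x0, y, h)
    d = -g
    history = [loss]
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        if g @ d >= 0:                 # not a descent direction: restart
            d = -g
        step = 1.0
        while True:                    # Armijo backtracking line search
            new_loss, new_g = loss_and_grad(theta + step * d, x0, y, h)
            if new_loss <= loss + 1e-4 * step * (g @ d) or step < 1e-12:
                break
            step *= 0.5
        theta = theta + step * d
        beta = max(0.0, new_g @ (new_g - g) / (g @ g))
        d = -new_g + beta * d
        loss, g = new_loss, new_g
        history.append(loss)
    return theta, history

# Synthetic data generated from a known depth-varying parameter (hypothetical).
N = 10
h = 1.0 / N
t = np.arange(N) * h
theta_true = 1.0 + 0.5 * np.sin(2 * np.pi * t)
x0 = np.linspace(-1.0, 1.0, 20)
y = forward(theta_true, x0, h)[-1]

theta_hat, history = ncg(np.zeros(N), x0, y, h)
print("initial loss:", history[0], "final loss:", history[-1])
```

Because only the final state is observed, many parameter profiles can fit the data equally well, so the recovered `theta_hat` need not match `theta_true`. The sketch illustrates the loss minimization only; it omits the penalty terms and the Sobolev gradient that the summary above describes.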
Taxonomy
Topics: Advanced Measurement and Metrology Techniques · Model Reduction and Neural Networks · Numerical methods in inverse problems
Methods: Neural Ordinary Differential Equations (NODE)
