Learning via nonlinear conjugate gradients and depth-varying neural ODEs

George Baravdish; Gabriel Eilertsen; Rym Jaroudi; B. Tomas Johansson; Lukáš Malý; Jonas Unger

arXiv:2202.05766 · cs.LG · February 14, 2022

TL;DR

This paper introduces a method for reconstructing the depth-varying (time-dependent) parameters of a neural ODE with a nonlinear conjugate gradient method, and reports improved stability and smoothness for residual networks with time-continuous layers.

Contribution

It develops a new parameter reconstruction approach for neural ODEs based on nonlinear conjugate gradients, including mathematical analysis and a Sobolev gradient for stability.

Findings

1. Method performs well on synthetic datasets
2. Outperforms standard gradient approaches
3. Ensures stability and smoothness in deep networks

Abstract

The inverse problem of supervised reconstruction of depth-variable (time-dependent) parameters in a neural ordinary differential equation (NODE) is considered, that is, finding the weights of a residual network with time-continuous layers. The NODE is treated as an isolated entity describing the full network, as opposed to earlier research, which embedded it between pre- and post-appended layers trained by conventional methods. The proposed parameter reconstruction is done for a general first-order differential equation by minimizing a cost functional covering a variety of loss functions and penalty terms. A nonlinear conjugate gradient (NCG) method is derived for the minimization. Mathematical properties are stated for the differential equation and the cost functional. The adjoint problem needed is derived together with a sensitivity problem. The sensitivity problem can estimate…
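To make the idea in the abstract concrete, here is a minimal sketch of fitting a depth-varying parameter with a nonlinear conjugate gradient iteration. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the scalar dynamics x'(t) = tanh(θ(t) x(t)), the forward Euler discretization, the least-squares cost at the final time, the Polak-Ribière+ update with restart, and the Armijo backtracking line search. The gradient is obtained from a discrete adjoint (backward) sweep; penalty terms and the Sobolev gradient mentioned above are left out. The names `forward`, `cost_and_grad`, and `ncg` are hypothetical helpers.

```python
import numpy as np

def forward(theta, x0, T=1.0):
    """Forward Euler for the toy dynamics x'(t) = tanh(theta(t) * x(t))."""
    n = theta.size
    h = T / n
    xs = np.empty((n + 1, x0.size))
    xs[0] = x0
    for k in range(n):
        xs[k + 1] = xs[k] + h * np.tanh(theta[k] * xs[k])
    return xs

def cost_and_grad(theta, x0, y, T=1.0):
    """Least-squares cost at the final time and its gradient (discrete adjoint)."""
    n = theta.size
    h = T / n
    xs = forward(theta, x0, T)
    r = xs[-1] - y
    cost = 0.5 * np.mean(r ** 2)
    lam = r / r.size                      # adjoint state at the final time
    grad = np.empty(n)
    for k in reversed(range(n)):          # backward sweep through the layers
        s = 1.0 - np.tanh(theta[k] * xs[k]) ** 2   # derivative of tanh
        grad[k] = h * np.sum(lam * s * xs[k])
        lam = lam * (1.0 + h * s * theta[k])
    return cost, grad

def ncg(theta, x0, y, iters=100):
    """Polak-Ribiere+ conjugate gradients with Armijo backtracking (assumed variant)."""
    cost, g = cost_and_grad(theta, x0, y)
    d = -g
    step = 1.0 / max(np.linalg.norm(g), 1e-12)
    history = [cost]
    for _ in range(iters):
        if g @ g < 1e-24:                 # gradient is numerically zero
            break
        slope = g @ d
        if slope >= 0.0:                  # not a descent direction: restart
            d, slope = -g, -(g @ g)
        step *= 2.0                       # try a slightly longer step first
        new_cost, new_g = cost_and_grad(theta + step * d, x0, y)
        while new_cost > cost + 1e-4 * step * slope and step > 1e-12:
            step *= 0.5
            new_cost, new_g = cost_and_grad(theta + step * d, x0, y)
        if new_cost > cost:               # line search failed, stop
            break
        beta = max(0.0, new_g @ (new_g - g) / (g @ g))
        theta, cost, g = theta + step * d, new_cost, new_g
        d = -g + beta * d
        history.append(cost)
    return theta, history

# Synthetic data from a known depth-varying parameter (illustrative choice).
rng = np.random.default_rng(0)
n = 20
t = (np.arange(n) + 0.5) / n
theta_true = 1.0 + 0.5 * np.sin(2.0 * np.pi * t)
x0 = rng.uniform(-1.0, 1.0, size=64)
y = forward(theta_true, x0)[-1]

theta_hat, history = ncg(np.zeros(n), x0, y)
print(history[0], history[-1])
```

The backtracking search only accepts steps that do not increase the cost, so `history` is non-increasing by construction. The recovered `theta_hat` need not match `theta_true` closely, since many parameter profiles can produce nearly the same final state; this is one reason a cost functional with penalty terms is of interest.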

Taxonomy

Topics: Advanced Measurement and Metrology Techniques · Model Reduction and Neural Networks · Numerical methods in inverse problems

Methods: Neural ordinary differential equations (NODE)