Correcting Auto-Differentiation in Neural-ODE Training

Yewei Xu, Shi Chen, and Qin Li

arXiv:2306.02192·cs.LG·March 31, 2026·1 cites

TL;DR

This paper shows that auto-differentiation can return inaccurate gradients when neural ODEs are discretized with high-order methods, and proposes post-processing techniques for Leapfrog and 2-stage ERK that remove the resulting gradient oscillations and improve convergence.

Contribution

The paper identifies artificial oscillations in the gradients that brute-force auto-differentiation produces for high-order methods (LMM and ERK), and offers simple post-processing corrections for Leapfrog and 2-stage ERK.

Findings

01 Auto-differentiation can introduce artificial oscillations in gradients for high-order methods.

02 Post-processing techniques effectively eliminate oscillations and correct gradients.

03 Corrected gradients lead to better convergence in neural ODE training.

Abstract

Does the use of auto-differentiation yield reasonable updates for deep neural networks (DNNs)? Specifically, when DNNs are designed to adhere to neural ODE architectures, can we trust the gradients provided by auto-differentiation? Through mathematical analysis and numerical evidence, we demonstrate that when neural networks employ high-order methods, such as Linear Multistep Methods (LMM) or Explicit Runge-Kutta Methods (ERK), to approximate the underlying ODE flows, brute-force auto-differentiation often introduces artificial oscillations in the gradients that prevent convergence. In the case of Leapfrog and 2-stage ERK, we propose simple post-processing techniques that effectively eliminate these oscillations, correct the gradient computation, and thus return accurate updates.
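The oscillation described in the abstract can be reproduced on a toy problem. The sketch below is not taken from the paper: it assumes the scalar ODE dx/dt = theta(t) x, one parameter per layer, a forward-Euler starter step, and the loss L = x(T). The function names are hypothetical, and the backward pass is written out by hand to perform the same reverse-mode sweep that auto-differentiation would apply to the discretized network. For this toy problem the continuous-time gradient with respect to each layer's parameter is h x(T), which serves as the reference. The neighbor-averaging step at the end is only an illustrative filter chosen for this sketch; the abstract does not spell out the authors' post-processing, so it should not be read as their method.

```python
import math

def leapfrog_forward(theta, x0, h):
    """Integrate dx/dt = theta(t) * x with one parameter theta[n] per layer."""
    xs = [x0, x0 + h * theta[0] * x0]  # assumption: forward-Euler starter step
    for n in range(1, len(theta)):
        # Leapfrog (two-step midpoint): x_{n+1} = x_{n-1} + 2 h theta_n x_n
        xs.append(xs[n - 1] + 2.0 * h * theta[n] * xs[n])
    return xs

def loss_and_grad(theta, x0, h):
    """Return L = x_N and dL/dtheta via a hand-written reverse-mode sweep."""
    xs = leapfrog_forward(theta, x0, h)
    N = len(theta)
    xbar = [0.0] * (N + 1)  # xbar[n] accumulates dL/dx_n
    xbar[N] = 1.0           # L = x_N
    grad = [0.0] * N
    for n in range(N - 1, 0, -1):
        # Reverse of x_{n+1} = x_{n-1} + 2 h theta_n x_n
        grad[n] = 2.0 * h * xs[n] * xbar[n + 1]
        xbar[n] += 2.0 * h * theta[n] * xbar[n + 1]
        xbar[n - 1] += xbar[n + 1]
    grad[0] = h * x0 * xbar[1]  # reverse of the Euler starter step
    return xs[N], grad

def neighbor_average(grad):
    """Illustrative smoothing only: average each pair of adjacent entries."""
    return [0.5 * (grad[n] + grad[n + 1]) for n in range(len(grad) - 1)]

N, T, x0 = 20, 1.0, 1.0
h = T / N
theta = [1.0] * N
loss, grad = loss_and_grad(theta, x0, h)
reference = h * x0 * math.exp(T)  # continuous-time gradient per layer, h * x(T)
smoothed = neighbor_average(grad)

print("reference per-layer gradient:", round(reference, 4))
for n in range(N - 4, N):
    print("layer", n, "discrete gradient:", round(grad[n], 4))
print("average of last two layers:", round(smoothed[-1], 4))
```

In this setup the entries for the last few layers alternate between values well above and well below the reference, while the average of two adjacent entries lands close to it. The reverse sweep starts from xbar[N] = 1 with xbar[N-1] of size O(h), which is inconsistent starting data for a two-step recursion and so excites its parasitic mode.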
