TL;DR
This paper introduces a mixed precision training framework for neural ODEs that reduces memory and computation costs while maintaining accuracy. It combines a custom backpropagation scheme with dynamic scaling for stability and is released as an open-source PyTorch package.
Contribution
The authors develop a novel mixed precision training scheme specifically for neural ODEs, addressing stability and efficiency challenges with a custom backpropagation method and dynamic scaling.
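To make the term "dynamic scaling" concrete, here is a minimal sketch of generic dynamic gradient scaling as commonly used in mixed precision training. It is not the authors' scheme, whose details are not given in this summary. The names `to_half` and `DynamicScaler`, the initial scale, and the growth interval are all assumptions chosen for illustration; half precision is emulated with the standard-library `struct` module rather than PyTorch.

```python
import math
import struct


def to_half(x: float) -> float:
    """Round x to IEEE 754 half precision; out-of-range values become +/-inf."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


class DynamicScaler:
    """Hypothetical helper: generic dynamic scaling, not the paper's method."""

    def __init__(self, scale: float = 2.0**16, growth_interval: int = 3):
        self.scale = scale
        self.growth_interval = growth_interval
        self._good_steps = 0

    def unscale(self, scaled_grad_half: float):
        """Return the unscaled gradient, or None if the scaled value overflowed."""
        if not math.isfinite(scaled_grad_half):
            # Overflow: skip this step and shrink the scale.
            self.scale /= 2.0
            self._good_steps = 0
            return None
        grad = scaled_grad_half / self.scale  # unscale in high precision
        self._good_steps += 1
        if self._good_steps == self.growth_interval:
            # Several clean steps in a row: try a larger scale again.
            self.scale *= 2.0
            self._good_steps = 0
        return grad


scaler = DynamicScaler()
tiny_grad = 1e-8  # underflows to 0.0 if cast to half directly
recovered = scaler.unscale(to_half(tiny_grad * scaler.scale))
```

In this sketch a gradient that would underflow in half precision survives because it is multiplied by the scale before the cast and divided by it afterwards in high precision. The scale is halved whenever the scaled value overflows.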
Findings
Achieves approximately 50% memory reduction.
Provides up to 2x speedup in training.
Maintains accuracy comparable to single-precision training.
Abstract
Exploiting low-precision computations has become a standard strategy in deep learning to address the growing computational costs imposed by ever larger models and datasets. However, naively performing all computations in low precision can lead to roundoff errors and instabilities. Therefore, mixed precision training schemes usually store the weights in high precision and use low-precision computations only for whitelisted operations. Despite their success, these principles are currently not reliable for training continuous-time architectures such as neural ordinary differential equations (Neural ODEs). This paper presents a mixed precision training framework for neural ODEs consisting of explicit ODE solvers and a custom backpropagation scheme and shows their effectiveness in a range of learning tasks. Our scheme uses low-precision computations for evaluating the velocity, parameterized…
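The idea of evaluating the velocity in low precision while keeping the solver state in high precision can be illustrated with a small sketch. It is a toy under stated assumptions, not the paper's implementation. The scalar velocity `theta * z`, the forward Euler solver, and the helper names are hypothetical. Half precision is emulated with the standard-library `struct` module so that the example runs without PyTorch.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def velocity_low_precision(z: float, theta: float) -> float:
    """Hypothetical velocity f(z) = theta * z with half-precision inputs and output."""
    return to_half(to_half(theta) * to_half(z))


def euler_mixed(z0: float, theta: float, t1: float, n_steps: int) -> float:
    """Forward Euler: low-precision velocity, high-precision state update."""
    h = t1 / n_steps
    z = z0  # state stays in double precision
    for _ in range(n_steps):
        z = z + h * velocity_low_precision(z, theta)
    return z


def euler_all_low(z0: float, theta: float, t1: float, n_steps: int) -> float:
    """Forward Euler with every quantity rounded to half precision."""
    h = to_half(t1 / n_steps)
    z = to_half(z0)
    for _ in range(n_steps):
        z = to_half(z + to_half(h * velocity_low_precision(z, theta)))
    return z


# Reference: the same Euler recursion carried out entirely in double precision.
reference = (1.0 - 1.0 / 1000) ** 1000
mixed = euler_mixed(1.0, -1.0, 1.0, 1000)
all_low = euler_all_low(1.0, -1.0, 1.0, 1000)
print(abs(mixed - reference), abs(all_low - reference))
```

In the all-low variant each small update is rounded onto the coarse half-precision grid of the state, so roundoff can accumulate over many steps. The mixed variant only incurs the rounding error of the velocity evaluation itself.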
