Convolutional Neural Networks combined with Runge-Kutta Methods
Mai Zhu, Bo Chang, Chong Fu

TL;DR
This paper introduces Runge-Kutta Convolutional Neural Networks (RKCNNs), a class of network models constructed from high-order Runge-Kutta methods. The approach treats the forward pass of a network as a trajectory of a dynamical system and aims for higher efficiency and accuracy than existing solver-based models.
Contribution
The paper reinterprets ResNets as dynamical systems and proposes RKCNNs using high-order Runge-Kutta methods to improve efficiency and accuracy over existing models.
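The correspondence behind this reinterpretation can be written out in standard textbook notation. The symbols below ($x_n$, $F$, $\theta_n$, $h$, and the Runge-Kutta coefficients $a_{ij}$, $b_i$, $c_i$) are assumptions chosen for illustration and are not necessarily the paper's own notation.

```latex
% A residual block updates its input by adding a learned residual:
x_{n+1} = x_n + F(x_n;\, \theta_n)

% The forward Euler method for \dot{x} = f(t, x) with step size h has the same form:
x_{n+1} = x_n + h\, f(t_n, x_n)

% A general s-stage Runge-Kutta method replaces the single evaluation of f
% with a weighted sum of stage derivatives k_i:
x_{n+1} = x_n + h \sum_{i=1}^{s} b_i\, k_i,
\qquad
k_i = f\!\Big(t_n + c_i h,\; x_n + h \sum_{j=1}^{s} a_{ij}\, k_j\Big)

% The method is explicit when a_{ij} = 0 for all j \ge i; otherwise each k_i
% depends on itself or on later stages, and the method is implicit.
```

Read this way, a residual block is a first-order step, and a higher-order Runge-Kutta method is a natural candidate for a block that combines several stage evaluations per step.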
Findings
RKCNNs outperform other dynamical system models in accuracy.
RKCNNs require fewer resources for comparable or better performance.
Experimental results validate the effectiveness of RKCNNs on benchmark datasets.
Abstract
A convolutional neural network can be constructed using numerical methods for solving dynamical systems, since the forward pass of the network can be regarded as a trajectory of a dynamical system. However, existing models based on numerical solvers cannot avoid the iterations of implicit methods, which makes the models inefficient at inference time. In this paper, we reinterpret the pre-activation Residual Networks (ResNets) and their variants from the dynamical systems view. We consider that the iterations of implicit Runge-Kutta methods are fused into the training of these models. Moreover, we propose a novel approach to constructing network models based on high-order Runge-Kutta methods in order to achieve higher efficiency. Our proposed models are referred to as the Runge-Kutta Convolutional Neural Networks (RKCNNs). The RKCNNs are evaluated on multiple benchmark datasets. The…
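To make the link between solver order and accuracy concrete, here is a minimal numerical sketch, not the authors' architecture. It assumes a toy autonomous system dx/dt = -x in place of a learned convolutional residual function, and compares a forward Euler step (the form a residual block takes) with the classical explicit fourth-order Runge-Kutta step. All function names are hypothetical.

```python
import numpy as np

def f(x):
    # Toy vector field dx/dt = -x (an assumption for illustration);
    # the exact solution is x(t) = x0 * exp(-t).
    return -x

def euler_step(f, x, h):
    # First-order update, same form as a residual block: x + h * f(x).
    return x + h * f(x)

def rk4_step(f, x, h):
    # Classical explicit 4-stage Runge-Kutta update for an autonomous system.
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate(step, f, x0, h, n_steps):
    # Apply the same step function n_steps times, like stacking identical blocks.
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = step(f, x, h)
    return x

x0 = np.array([1.0, 2.0])
h, n_steps = 0.1, 10
exact = x0 * np.exp(-h * n_steps)

euler_err = np.max(np.abs(integrate(euler_step, f, x0, h, n_steps) - exact))
rk4_err = np.max(np.abs(integrate(rk4_step, f, x0, h, n_steps) - exact))
print("Euler max error:", euler_err)
print("RK4 max error:  ", rk4_err)
```

With the same number of steps, the fourth-order method tracks the exact trajectory far more closely than the Euler update, at the cost of four evaluations of `f` per step. This sketch uses an explicit method only; the implicit Runge-Kutta methods mentioned in the abstract would additionally require solving for the stage values.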
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
