Improving Neural ODEs via Knowledge Distillation
Haoyu Chu, Shikui Wei, Qiming Lu, Yao Zhao

TL;DR
This paper improves Neural ODEs for image recognition by distilling knowledge from ResNet teachers, raising classification accuracy (by 24% on CIFAR10 and 5% on SVHN) and improving robustness against adversarial examples.
Contribution
It introduces a novel training method for Neural ODEs using knowledge distillation from ResNet teachers, boosting performance on image tasks.
Findings
24% accuracy improvement on CIFAR10
5% accuracy improvement on SVHN
Enhanced robustness to adversarial examples
Abstract
Neural Ordinary Differential Equations (Neural ODEs) construct the continuous dynamics of hidden units using ordinary differential equations specified by a neural network, demonstrating promising results on many tasks. However, Neural ODEs still do not perform well on image recognition tasks. A possible reason is that the one-hot encoding vector commonly used to train Neural ODEs cannot provide enough supervised information. We propose a new training method based on knowledge distillation to construct more powerful and robust Neural ODEs suited to image recognition tasks. Specifically, we model the training of Neural ODEs as a teacher-student learning process, in which we propose ResNets as the teacher model to provide richer supervised information. The experimental results show that the new training method can improve the classification accuracy of Neural ODEs by 24% on CIFAR10 and 5% on SVHN. In…
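The abstract does not spell out the loss used in the teacher-student process, so the following is only a minimal sketch of one common formulation of knowledge distillation (temperature-softened teacher targets combined with the usual one-hot cross-entropy), not necessarily the authors' exact objective. The names `log_softmax` and `distillation_loss`, and the parameters `temperature` and `alpha`, are assumptions made for illustration; in the paper's setting the student logits would come from a Neural ODE and the teacher logits from a ResNet.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical distillation objective for a single example.

    alpha weights the hard (one-hot) term; (1 - alpha) weights the soft term.
    Both default values are illustrative assumptions, not values from the paper.
    """
    # Hard term: cross-entropy between the student prediction and the one-hot label.
    hard = -log_softmax(student_logits)[label]

    # Soft term: KL(teacher || student) on temperature-softened distributions.
    log_teacher = log_softmax(teacher_logits, temperature)
    log_student = log_softmax(student_logits, temperature)
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_teacher, log_student))

    # The temperature**2 factor keeps the soft-term gradient scale comparable
    # across temperatures, a convention from the distillation literature.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


# Example: a student that agrees with the teacher vs. one that disagrees.
teacher = [2.0, 0.5, -1.0]
agreeing_student = [2.0, 0.5, -1.0]
disagreeing_student = [-1.0, 0.5, 2.0]
loss_agree = distillation_loss(agreeing_student, teacher, label=0)
loss_disagree = distillation_loss(disagreeing_student, teacher, label=0)
```

In this sketch the soft term carries information about how the teacher ranks the incorrect classes, which is the "richer supervised information" that a one-hot label alone does not provide. Setting `alpha=1.0` recovers plain one-hot training.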
Taxonomy
Topics: Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning
Methods: Knowledge Distillation
