Improving Neural ODEs via Knowledge Distillation

Haoyu Chu; Shikui Wei; Qiming Lu; Yao Zhao

arXiv:2203.05103 · cs.CV · January 11, 2024

TL;DR

This paper enhances Neural ODEs for image recognition by applying knowledge distillation from ResNet teachers, significantly improving accuracy and robustness against adversarial attacks.

Contribution

It introduces a novel training method for Neural ODEs using knowledge distillation from ResNet teachers, boosting performance on image tasks.
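
As a rough illustration of what training a student from a teacher's outputs can look like, the sketch below computes a common temperature-scaled distillation objective: a weighted sum of the KL divergence between softened teacher and student distributions and the usual cross-entropy on the hard label. This is the widely used generic formulation, not necessarily the exact loss used in the paper; the function names, the `temperature` and `alpha` values, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy against a one-hot label given as a class index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.9):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are illustrative values, not taken from the paper.
    """
    hard = cross_entropy(softmax(student_logits), label)
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-term gradient scale comparable across temperatures.
    soft = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    return alpha * soft + (1.0 - alpha) * hard


# Hypothetical logits for a 3-class problem: a confident teacher (e.g. a ResNet)
# and an undertrained student (e.g. a Neural ODE).
teacher_logits = [6.0, 2.5, 0.5]
student_logits = [1.0, 0.8, 0.2]
loss = distillation_loss(student_logits, teacher_logits, label=0)
print(f"distillation loss: {loss:.4f}")
```

The soft term carries the teacher's relative confidence across all classes, which is the kind of richer supervisory signal a one-hot label cannot express.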

Findings

1. 24% accuracy improvement on CIFAR10
2. 5% accuracy improvement on SVHN
3. Enhanced robustness to adversarial examples

Abstract

Neural Ordinary Differential Equations (Neural ODEs) model the continuous dynamics of hidden units with ordinary differential equations specified by a neural network, and have shown promising results on many tasks. However, Neural ODEs still perform poorly on image recognition tasks. A possible reason is that the one-hot encoded labels commonly used to train Neural ODEs do not provide enough supervisory information. We propose a new training method based on knowledge distillation to construct more powerful and robust Neural ODEs for image recognition tasks. Specifically, we cast the training of Neural ODEs as a teacher-student learning process, in which ResNets serve as the teacher model and provide richer supervisory information. Experimental results show that the new training method improves the classification accuracy of Neural ODEs by 24% on CIFAR10 and 5% on SVHN. In…
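
To make the phrase "continuous dynamics of hidden units" concrete, here is a minimal sketch of a hidden state evolved by dh/dt = f(h, t) with a fixed-step Euler integrator. The solver choice, the toy `dynamics` function, and all names are assumptions for illustration; practical Neural ODE implementations usually rely on adaptive solvers and a learned network for f, and the source does not say which solver the paper uses.

```python
import math


def dynamics(h, t, weight=-0.5):
    """Hypothetical stand-in for the learned network f(h, t); ignores t here."""
    return [math.tanh(weight * x) for x in h]


def odeint_euler(f, h0, t0=0.0, t1=1.0, steps=100):
    """Integrate dh/dt = f(h, t) from t0 to t1 with fixed-step forward Euler."""
    h = list(h0)
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        dh = f(h, t)
        h = [x + dt * d for x, d in zip(h, dh)]  # h <- h + dt * f(h, t)
        t += dt
    return h


# Hidden state at t = 1 starting from a hypothetical 3-dimensional input.
h1 = odeint_euler(dynamics, [1.0, -2.0, 0.5])
print(h1)
```

With `steps=1` and a unit time interval, the update reduces to h + f(h), which has the same form as a single residual block. That correspondence is one common way to relate Neural ODEs to ResNets.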

Taxonomy

Topics: Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning

Methods: Knowledge Distillation