Learning from Matured Dumb Teacher for Fine Generalization

HeeSeung Jung; Kangil Kim; Hoyong Kim; Jong-Hun Shin

arXiv:2108.05776·cs.LG·October 28, 2024

Open Access

TL;DR

This paper introduces a matured dumb teacher knowledge distillation method that conservatively transfers decision boundary hypotheses, leading to improved generalization in neural networks across multiple image classification datasets.

Contribution

It proposes a novel matured dumb teacher KD approach that enhances generalization by preserving decision boundary hypotheses without destroying trained information.

Findings

01. Consistent improvement in test performance across datasets
02. Finer generalization compared to existing methods
03. Stable results over hyperparameter grid search

Abstract

The flexibility of neural network decision boundaries that are unguided by training data is a well-known problem, typically addressed with generalization methods. A surprising result from recent knowledge distillation (KD) literature is that random, untrained, and equally structured teacher networks can also vastly improve generalization performance. This raises the possibility that there are undiscovered assumptions useful for generalization in uncertain regions. In this paper, we shed light on these assumptions by analyzing the decision boundaries and confidence distributions of both simple and KD-based generalization methods. Assuming that a decision boundary exists to represent the most general tendency of distinction on an input sample space (i.e., the simplest hypothesis), we show the various limitations of these methods when using this hypothesis. To resolve these limitations, we propose…
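
For readers unfamiliar with the KD setup the abstract refers to, the following is a minimal sketch of the standard soft-target distillation loss (cross-entropy on labels plus a temperature-scaled KL term toward the teacher). It is not the matured dumb teacher method, whose details are not given on this page. The function names, `alpha`, `temperature`, and the random logits standing in for a random, untrained teacher are all illustrative assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=4.0):
    """Generic soft-target KD loss (an assumed, textbook formulation).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence from the teacher's softened distribution to the student's.
    """
    eps = 1e-12  # avoids log(0)
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # KL(teacher || student) per sample, on temperature-softened outputs
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student_soft + eps)),
        axis=-1,
    )
    # Cross-entropy against the ground-truth labels at temperature 1
    log_p_student = np.log(softmax(student_logits) + eps)
    ce = -log_p_student[np.arange(len(labels)), labels]
    # temperature**2 keeps the soft-target gradient scale comparable
    return float(np.mean(alpha * ce + (1.0 - alpha) * temperature**2 * kl))

# Hypothetical data: random logits stand in for an untrained teacher's outputs.
rng = np.random.default_rng(0)
student_logits = rng.normal(size=(8, 10))
teacher_logits = rng.normal(size=(8, 10))
labels = rng.integers(0, 10, size=8)
loss = kd_loss(student_logits, teacher_logits, labels)
```

With `alpha=0.0` and identical student and teacher logits, the loss is zero, since the KL term vanishes when the two distributions match.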

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning

Methods: Knowledge Distillation · Convolution