Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free Ensembles of DNNs

Uri Stern; Daphna Weinshall

arXiv:2310.11094 · cs.LG · December 29, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a new overfit measure based on forgetting rates, revealing that overfit can occur without accuracy drops, and proposes a training-history-based ensemble method that improves deep model performance efficiently.

Contribution

The paper presents a novel overfit score and a training-history-based ensemble method that enhances deep neural network performance without additional training cost.
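To make the idea of an ensemble built from training history concrete, here is a minimal sketch. It assumes that class probabilities on the evaluation data were saved at several checkpoints during a single training run and combines them by plain averaging. The function name `ensemble_predict`, the input layout, and the averaging rule are illustrative assumptions; this page does not describe how the paper selects or weights checkpoints, so this should not be read as the authors' method.

```python
from typing import List, Sequence


def ensemble_predict(
    checkpoint_probs: Sequence[Sequence[Sequence[float]]],
) -> List[int]:
    """Average class probabilities over saved checkpoints (assumed rule).

    checkpoint_probs[c][i][k] is the probability that checkpoint c assigns
    to class k for sample i. Returns the predicted class index per sample.
    """
    num_checkpoints = len(checkpoint_probs)
    if num_checkpoints == 0:
        raise ValueError("need at least one checkpoint")

    num_samples = len(checkpoint_probs[0])
    predictions = []
    for i in range(num_samples):
        num_classes = len(checkpoint_probs[0][i])
        # Mean probability of each class across all checkpoints.
        mean_probs = [
            sum(cp[i][k] for cp in checkpoint_probs) / num_checkpoints
            for k in range(num_classes)
        ]
        # Predict the class with the highest mean probability.
        predictions.append(max(range(num_classes), key=mean_probs.__getitem__))
    return predictions


# Hypothetical probabilities from three checkpoints, two samples, two classes.
probs = [
    [[0.6, 0.4], [0.9, 0.1]],  # early checkpoint
    [[0.2, 0.8], [0.6, 0.4]],  # middle checkpoint
    [[0.3, 0.7], [0.4, 0.6]],  # final checkpoint
]
print(ensemble_predict(probs))
```

In this toy input the final checkpoint alone would predict class 1 for the second sample, while the averaged prediction is class 0, which illustrates how earlier checkpoints can restore an answer the final model no longer gives. No extra training is needed because the checkpoints come from the same run.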

Findings

01. Overfit can occur without a decrease in validation accuracy.

02. The proposed ensemble method improves performance across datasets.

03. The method outperforms comparable approaches and boosts ImageNet accuracy by 1%.

Abstract

The infrequent occurrence of overfit in deep neural networks is perplexing. On the one hand, theory predicts that as models get larger they should eventually become too specialized for a specific training set, with ensuing decrease in generalization. In contrast, empirical results in image classification indicate that increasing the training time of deep models or using bigger models almost never hurts generalization. Is it because the way we measure overfit is too limited? Here, we introduce a novel score for quantifying overfit, which monitors the forgetting rate of deep models on validation data. Presumably, this score indicates that even while generalization improves overall, there are certain regions of the data space where it deteriorates. When thus measured, we show that overfit can occur with and without a decrease in validation accuracy, and may be more common than previously…
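The notion of a forgetting rate on validation data can be illustrated with a small sketch. It assumes one plausible reading: at each epoch, count the validation samples that are misclassified now but were classified correctly at some earlier epoch, and divide by the validation set size. The function name `forget_fraction` and this exact definition are assumptions for illustration; the paper's precise score may differ.

```python
from typing import List, Sequence


def forget_fraction(correct_history: Sequence[Sequence[bool]]) -> List[float]:
    """Per-epoch fraction of validation samples that have been 'forgotten'.

    correct_history[e][i] is True if validation sample i is classified
    correctly after epoch e. A sample counts as forgotten at epoch e if it
    is wrong at e but was right at some earlier epoch (assumed definition).
    """
    if not correct_history:
        return []

    num_samples = len(correct_history[0])
    ever_correct = [False] * num_samples
    rates = []
    for epoch_correct in correct_history:
        if len(epoch_correct) != num_samples:
            raise ValueError("every epoch must cover the same samples")
        forgotten = sum(
            1 for seen, now in zip(ever_correct, epoch_correct) if seen and not now
        )
        rates.append(forgotten / num_samples if num_samples else 0.0)
        # Update which samples have been correct at least once so far.
        ever_correct = [seen or now for seen, now in zip(ever_correct, epoch_correct)]
    return rates


# Hypothetical correctness records: three epochs, four validation samples.
history = [
    [True, False, False, True],
    [True, True, False, False],
    [False, True, True, False],
]
accuracy = [sum(epoch) / len(epoch) for epoch in history]
print(accuracy)                  # validation accuracy per epoch
print(forget_fraction(history))  # forgotten fraction per epoch
```

In this toy record the validation accuracy stays at 0.5 in every epoch while the forgotten fraction grows from 0.0 to 0.5, which mirrors the abstract's claim that overfit, measured this way, can occur without a drop in validation accuracy.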

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications