Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free Ensembles of DNNs
Uri Stern, Daphna Weinshall

TL;DR
This paper introduces a new overfit measure based on forgetting rates, shows that overfit can occur without a drop in validation accuracy, and proposes a training-history-based ensemble method that improves deep model performance at no additional training cost.
Contribution
The paper presents a novel overfit score and a training-history-based ensemble method that enhances deep neural network performance without additional training cost.
Findings
Overfit can occur without validation accuracy decrease.
The proposed ensemble method improves performance across datasets.
The method outperforms comparable approaches and boosts ImageNet accuracy by 1%.
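One way to picture an ensemble built from training history is to combine the predictions of checkpoints saved during a single training run, so that no extra model is trained. The sketch below is illustrative only: the function name `ensemble_predict`, the uniform averaging of class probabilities, and the toy numbers are assumptions, and the paper's actual rule for selecting and weighting checkpoints is not described in this summary.

```python
import numpy as np

def ensemble_predict(checkpoint_probs):
    """Average class probabilities over saved checkpoints (hypothetical helper).

    checkpoint_probs: array of shape (n_checkpoints, n_samples, n_classes),
    where each slice holds the class probabilities one checkpoint assigns
    to the same set of samples.
    Returns the predicted class index per sample.
    """
    probs = np.asarray(checkpoint_probs, dtype=float)
    # Uniform averaging is an assumption, not the paper's stated rule.
    return probs.mean(axis=0).argmax(axis=1)

# Toy example: the final checkpoint misclassifies sample 1 (true class 1),
# but an earlier checkpoint was confident about it.
history = [
    [[0.9, 0.1], [0.2, 0.8]],    # earlier checkpoint
    [[0.6, 0.4], [0.55, 0.45]],  # final checkpoint
]
final_only = np.asarray(history[-1]).argmax(axis=1)
combined = ensemble_predict(history)
```

In this toy case the averaged prediction recovers the class that the final checkpoint had "forgotten", which is the intuition behind reusing earlier checkpoints.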
Abstract
The infrequent occurrence of overfit in deep neural networks is perplexing. On the one hand, theory predicts that as models get larger they should eventually become too specialized for a specific training set, with ensuing decrease in generalization. In contrast, empirical results in image classification indicate that increasing the training time of deep models or using bigger models almost never hurts generalization. Is it because the way we measure overfit is too limited? Here, we introduce a novel score for quantifying overfit, which monitors the forgetting rate of deep models on validation data. Presumably, this score indicates that even while generalization improves overall, there are certain regions of the data space where it deteriorates. When thus measured, we show that overfit can occur with and without a decrease in validation accuracy, and may be more common than previously…
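The forgetting-based score described in the abstract can be illustrated with a small sketch. It assumes that per-epoch correctness on the validation set has been recorded, and it takes the forgetting rate at an epoch to be the fraction of validation samples that are misclassified at that epoch but were classified correctly at some earlier epoch. That definition, the function name `forgetting_rate`, and the toy data are assumptions made for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def forgetting_rate(correct):
    """Per-epoch forgetting rate on validation data (hypothetical helper).

    correct: boolean array of shape (n_epochs, n_samples); correct[e, i] is
    True if validation sample i is classified correctly after epoch e.
    Returns an array of length n_epochs with the fraction of samples that
    are wrong at epoch e but were correct at some earlier epoch.
    """
    correct = np.asarray(correct, dtype=bool)
    # cum[e, i]: sample i was correct at some epoch <= e
    cum = np.logical_or.accumulate(correct, axis=0)
    # ever_before[e, i]: sample i was correct at some epoch < e
    ever_before = np.zeros_like(correct)
    ever_before[1:] = cum[:-1]
    forgotten = ever_before & ~correct
    return forgotten.mean(axis=1)

# Toy history: 3 epochs, 4 validation samples.
history = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
], dtype=bool)
accuracy = history.mean(axis=1)
forgetting = forgetting_rate(history)
```

In this toy history the validation accuracy never decreases from one epoch to the next, yet the forgetting rate is nonzero in the later epochs, which mirrors the abstract's point that overfit measured this way can occur without a drop in accuracy.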
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
