Towards Mitigating Architecture Overfitting on Distilled Datasets
Xuyang Zhong, Chen Liu

TL;DR
This paper proposes methods such as DropPath and knowledge distillation to reduce architecture overfitting in dataset distillation, so that a distilled dataset generalizes better to test networks of different architectures and sizes.
Contribution
The paper introduces novel techniques to mitigate architecture overfitting in dataset distillation, improving generalization across various network architectures and capacities.
Findings
Significantly reduces architecture overfitting across tasks.
Enables training with distilled datasets to generalize to larger test networks.
Achieves comparable or superior performance on larger test architectures.
Abstract
Dataset distillation methods have demonstrated remarkable performance for neural networks trained with very limited training data. However, a significant challenge arises in the form of architecture overfitting: the distilled training dataset synthesized by a specific network architecture (i.e., the training network) yields poor performance when used to train other network architectures (i.e., test networks), especially when the test networks have a larger capacity than the training network. This paper introduces a series of approaches to mitigate this issue. Among them, DropPath turns the large model into an implicit ensemble of its sub-networks, and knowledge distillation ensures that each sub-network acts similarly to the small but well-performing teacher network. These methods, characterized by their smoothing effects, significantly mitigate architecture overfitting. We conduct…
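The two mechanisms named in the abstract can be illustrated with a minimal, generic sketch. It is not the authors' implementation: the function names (`drop_path`, `softmax`, `kd_loss`), the temperature value, and the use of plain Python lists in place of tensors are all assumptions made for illustration, and the paper's exact variants and hyperparameters are not given in this summary.

```python
import math
import random


def drop_path(branch_output, skip_input, drop_prob, training, rng):
    """Residual block output with a randomly dropped branch (hypothetical helper).

    During training the branch is skipped with probability drop_prob, so each
    forward pass runs a random sub-network of the full model. Surviving
    branches are rescaled by 1 / keep_prob to keep the expected output equal
    to the full-network output.
    """
    if not training or drop_prob == 0.0:
        return [s + b for s, b in zip(skip_input, branch_output)]
    if rng.random() < drop_prob:
        return list(skip_input)  # branch dropped: only the skip path remains
    keep_prob = 1.0 - drop_prob
    return [s + b / keep_prob for s, b in zip(skip_input, branch_output)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the common textbook form of a distillation loss; the temperature
    default here is an assumed value, not one taken from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In a setup like the one the abstract describes, the large test network would be the student with DropPath applied to its residual branches, and the small training network would be the teacher supplying `teacher_logits`. How the distillation term is weighted against the ordinary classification loss is left open here.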
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Neural Networks and Applications
