Investigating the Robustness of Subtask Distillation under Spurious Correlation
Pattarawat Chormai, Klaus-Robert Müller, Grégoire Montavon

TL;DR
This paper evaluates how different subtask distillation methods perform when the training data contains spurious correlations, highlighting the robustness of recent methods like SubDistill compared to baselines.
Contribution
It provides a systematic analysis of distillation methods' robustness to spurious correlations, emphasizing the effectiveness of SubDistill in such scenarios.
Findings
SubDistill remains fairly robust as correlation strength increases.
Some baseline methods degrade to near-random performance as the correlations grow stronger.
The study highlights challenges of knowledge distillation on real-world, imperfect datasets.
Abstract
Subtask distillation is an emerging paradigm in which compact, specialized models are extracted from large, general-purpose 'foundation models' for deployment in environments with limited resources or in standalone computer systems. Although distillation uses a teacher model, it still relies on a dataset that is often limited in size and may lack representativeness or exhibit spurious correlations. In this paper, we evaluate established distillation methods, as well as the recent SubDistill method, when using data with spurious correlations for distillation. As the strength of the correlations increases, we observe a widening gap between advanced methods, such as SubDistill, which remain fairly robust, and some baseline methods, which degrade to near-random performance. Overall, our study underscores the challenges of knowledge distillation when applied to imperfect, real-world…
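To make the notion of "correlation strength" concrete, here is a minimal sketch in Python. It is not the paper's experimental setup: the function names, the binary label and binary spurious attribute, and the definition of `strength` as the probability that the attribute agrees with the label are all assumptions made for illustration. The sketch shows why a model that latches onto the spurious attribute looks accurate on the distillation data but falls to chance once the correlation is absent.

```python
import random


def make_spurious_dataset(n, strength, seed=0):
    """Return n (label, spurious) pairs.

    Hypothetical construction: the spurious attribute equals the label
    with probability `strength` (0.5 = uncorrelated, 1.0 = perfectly correlated).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        spurious = label if rng.random() < strength else 1 - label
        data.append((label, spurious))
    return data


def shortcut_accuracy(data):
    """Accuracy of a hypothetical 'shortcut' predictor that outputs the spurious attribute."""
    return sum(label == spurious for label, spurious in data) / len(data)


if __name__ == "__main__":
    # Test data with no correlation between label and spurious attribute.
    test = make_spurious_dataset(10_000, strength=0.5, seed=2)
    for strength in (0.5, 0.8, 0.95, 1.0):
        train = make_spurious_dataset(10_000, strength, seed=1)
        print(
            f"strength={strength:.2f}  "
            f"shortcut acc on distillation data={shortcut_accuracy(train):.3f}  "
            f"on uncorrelated test data={shortcut_accuracy(test):.3f}"
        )
```

In this toy setting the shortcut predictor's accuracy on the distillation data tracks `strength`, while its accuracy on the uncorrelated test data stays near 0.5. That is the kind of near-random behavior a student could exhibit if it learned the spurious attribute instead of the teacher's actual decision rule.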
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Mobile Crowdsensing and Crowdsourcing
