Exploiting a Zoo of Checkpoints for Unseen Tasks
Jiaji Huang, Qiang Qiu, Kenneth Church

TL;DR
This paper introduces a method to select representative model checkpoints using Gaussian processes and mutual information, enabling better generalization to unseen tasks in NLP and computer vision.
Contribution
The paper proposes an approach for identifying a subset of checkpoints that covers the task space, improving transfer to new tasks.
Findings
Selected checkpoints outperform random choices on unseen tasks.
The method is effective across NLP and computer vision domains.
The approach estimates the Gaussian process covariance from the checkpoints and unlabeled probing data.
Abstract
There are so many models in the literature that it is difficult for practitioners to decide which combinations are likely to be effective for a new task. This paper attempts to address this question by capturing relationships among checkpoints published on the web. We model the space of tasks as a Gaussian process. The covariance can be estimated from checkpoints and unlabeled probing data. With the Gaussian process, we can identify representative checkpoints by a maximum mutual information criterion. This objective is submodular. A greedy method identifies representatives that are likely to "cover" the task space. These representatives generalize to new tasks with superior performance. Empirical evidence is provided for applications from both computational linguistics and computer vision.
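The greedy selection step described in the abstract can be illustrated with a minimal sketch. It assumes a covariance matrix over checkpoints is already available (how the paper estimates it from probing data is not specified in this summary) and uses one common greedy rule for Gaussian mutual information: at each step, pick the candidate whose variance is least explained by the checkpoints already selected, relative to how well it is explained by the remaining unselected ones. The function names, the `jitter` parameter, and the toy covariance are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def conditional_variance(K, i, S):
    """Variance of variable i given the variables indexed by S,
    under a zero-mean Gaussian with covariance matrix K."""
    S = list(S)
    if not S:
        return K[i, i]
    K_SS = K[np.ix_(S, S)]
    k_iS = K[i, S]
    return K[i, i] - k_iS @ np.linalg.solve(K_SS, k_iS)


def greedy_mi_selection(K, k, jitter=1e-8):
    """Greedily pick k indices that approximately maximize the mutual
    information between the selected and the unselected variables.
    `jitter` is an assumed numerical-stability term added to the diagonal."""
    n = K.shape[0]
    K = K + jitter * np.eye(n)  # works on a copy; the caller's K is unchanged
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for y in range(n):
            if y in selected:
                continue
            rest = [j for j in range(n) if j != y and j not in selected]
            # High score: y is poorly explained by what is already selected,
            # but well explained by (hence representative of) the rest.
            score = conditional_variance(K, y, selected) / conditional_variance(K, y, rest)
            if score > best_score:
                best, best_score = y, score
        selected.append(best)
    return selected


# Toy covariance (an assumption for illustration): two independent groups
# of three strongly correlated "checkpoints" each.
block = np.full((3, 3), 0.9) + 0.1 * np.eye(3)
K_toy = np.block([[block, np.zeros((3, 3))], [np.zeros((3, 3)), block]])
selected = greedy_mi_selection(K_toy, k=2)
print(selected)
```

On this toy matrix the two selected indices should come from different groups, which is the sense in which the representatives "cover" the space: a second pick from the same group adds little information once the first is known.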
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
