Patching open-vocabulary models by interpolating weights
Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt

TL;DR
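This paper introduces PAINT, a patching method for open-vocabulary models like CLIP that interpolates between a model's weights before and after fine-tuning on the task to be patched. It substantially improves accuracy on the patched tasks while largely preserving accuracy elsewhere, and it supports multi-task patching and broad transfer.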
Contribution
The paper presents PAINT, a weight interpolation technique for model patching. It improves accuracy on specific tasks where zero-shot performance is poor without degrading accuracy on tasks where performance is already adequate.
Findings
PAINT increases accuracy by 15 to 60 percentage points on nine tasks where zero-shot CLIP performs poorly.
PAINT preserves ImageNet accuracy within one percentage point of the zero-shot model.
Patching on one task can improve performance on other tasks with disjoint classes.
Abstract
Open-vocabulary models like CLIP achieve high accuracy across many image classification tasks. However, there are still settings where their zero-shot performance is far from optimal. We study model patching, where the goal is to improve accuracy on specific tasks without degrading accuracy on tasks where performance is already adequate. Towards this goal, we introduce PAINT, a patching method that uses interpolations between the weights of a model before fine-tuning and the weights after fine-tuning on a task to be patched. On nine tasks where zero-shot CLIP performs poorly, PAINT increases accuracy by 15 to 60 percentage points while preserving accuracy on ImageNet within one percentage point of the zero-shot model. PAINT also allows a single model to be patched on multiple tasks and improves with model scale. Furthermore, we identify cases of broad transfer, where patching on one…
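The interpolation described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each checkpoint is a plain dictionary mapping parameter names to lists of floats, and the names `interpolate_weights`, `pick_alpha`, and `score` are hypothetical. The rule for choosing the mixing coefficient (take the candidate with the highest held-out score) is likewise an assumption made for the example.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def interpolate_weights(zero_shot: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned, parameter by parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both checkpoints must have the same parameter names")

    patched: Weights = {}
    for name, w_zs in zero_shot.items():
        w_ft = fine_tuned[name]
        if len(w_zs) != len(w_ft):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # alpha = 0 recovers the zero-shot model, alpha = 1 the fine-tuned model.
        patched[name] = [(1.0 - alpha) * a + alpha * b for a, b in zip(w_zs, w_ft)]
    return patched


def pick_alpha(
    zero_shot: Weights,
    fine_tuned: Weights,
    score: Callable[[Weights], float],
    alphas: Sequence[float],
) -> float:
    """Return the candidate alpha whose interpolated weights score highest.

    `score` is an assumed user-supplied function, for example the average of
    held-out accuracy on the patched task and on a task that must be preserved.
    """
    return max(alphas, key=lambda a: score(interpolate_weights(zero_shot, fine_tuned, a)))
```

In practice the same element-wise combination would be applied to each tensor of a real checkpoint, and `score` would evaluate the interpolated model on held-out data.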
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Contrastive Language-Image Pre-training
