CHIPS: Efficient CLIP Adaptation via Curvature-aware Hybrid Influence-based Data Selection
Xinlin Zhuang, Yichen Li, Xiwei Liu, Haolin Yang, Yifan Lu, Ziyun Zou, Yulong Li, Huifa Li, Dongliang Chen, Qinglei Wang, Weiyang Liu, Ying Qian, Jiangming Shi, Imran Razzak

TL;DR
This paper introduces CHIPS, a data selection method that effectively adapts CLIP to specific domains by selecting high-utility image-text pairs, reducing the need for large datasets and improving performance on medical and general benchmarks.
Contribution
CHIPS presents a novel, theoretically justified data selection approach that integrates curvature-aware alignment, scalable estimators, and relevance weighting for efficient CLIP adaptation.
Findings
CHIPS achieves state-of-the-art results among selection methods on medical benchmarks.
It matches full-dataset continual pre-training (CPT) performance with only 30% of the data.
It outperforms half-dataset CPT using just 10% of the data.
Abstract
Adapting CLIP to vertical domains is typically approached by novel fine-tuning strategies or by continual pre-training (CPT) on large domain-specific datasets. Yet, data itself remains an underexplored factor in this process. We revisit this task from a data-centric perspective: Can effective data selection substitute for large-scale datasets in CPT? We introduce CHIPS (Curvature-aware Hybrid Influence in Projection Subspace), which assigns each image-text pair a utility score that integrates three complementary factors aligned with three goals: faithfulness via a curvature-aware and Newton-style alignment computed in CLIP's end-point subspace; scalability via an InfoNCE-aware curvature estimator with Johnson-Lindenstrauss (JL) sketching; and retention via a selection-aware relevance weight combined with learnability to balance target adaptation against general-domain preservation. We…
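The abstract names Johnson-Lindenstrauss (JL) sketching as its scalability ingredient, and the findings report selection budgets of 10% to 30%. The following is a minimal, generic sketch of those two ideas only. It is not the CHIPS utility score, whose exact form is not given in this excerpt. The function names, the dimensions, the synthetic data, and the use of a plain inner product as the "score" are all assumptions made for illustration.

```python
import numpy as np


def make_jl_matrix(input_dim, sketch_dim, seed=0):
    """Gaussian JL matrix (hypothetical helper, not from the paper).

    Entries are N(0, 1/sketch_dim), so sketched inner products equal the
    exact inner products in expectation.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, 1.0 / np.sqrt(sketch_dim), size=(input_dim, sketch_dim))


def select_top_fraction(scores, fraction):
    """Return indices of the highest-scoring `fraction` of examples."""
    k = max(1, int(round(fraction * len(scores))))
    return np.argsort(scores)[::-1][:k]


def demo(n=1000, input_dim=2048, sketch_dim=512, fraction=0.10):
    rng = np.random.default_rng(1)

    # Synthetic stand-in for a target direction (assumption: unit norm).
    target = rng.normal(size=input_dim)
    target /= np.linalg.norm(target)

    # Synthetic per-example vectors: a varying amount of the target
    # direction plus small isotropic noise.
    alignment = rng.uniform(0.0, 1.0, size=n)
    candidates = alignment[:, None] * target + 0.02 * rng.normal(size=(n, input_dim))

    # Exact scores need input_dim numbers per example ...
    exact = candidates @ target

    # ... while sketched scores need only sketch_dim numbers per example.
    projection = make_jl_matrix(input_dim, sketch_dim)
    approx = (candidates @ projection) @ (target @ projection)

    corr = np.corrcoef(exact, approx)[0, 1]
    selected = select_top_fraction(approx, fraction)
    return corr, selected


if __name__ == "__main__":
    corr, selected = demo()
    print(f"correlation between exact and sketched scores: {corr:.3f}")
    print(f"selected {len(selected)} examples")
```

In this toy setting the sketched scores track the exact ones closely while storing a quarter of the coordinates per example, which is the kind of trade-off that makes scoring a large candidate pool tractable. How CHIPS combines sketching with its curvature estimator and relevance weight is described in the paper itself.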
Code & Models
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_DOT_10 · model · 2 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_DOT_20 · model · 1 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_DOT_30 · model
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_TRACIN_10 · model · 1 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_TRACIN_20 · model · 1 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_TRACIN_30 · model · 2 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_CHIPS_10 · model · 1 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_CHIPS_20 · model · 1 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_CHIPS_30 · model · 1 dl
- 🤗 Mihara-bot/metaclip-b16-400m-biomedica_TRAK_10 · model · 2 dl
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Face recognition and analysis
