Info-Coevolution: An Efficient Framework for Data Model Coevolution
Ziheng Qin, Hailun Xu, Wei Chee Yew, Qi Jia, Yang Luo, Kanchan Sarkar, Danhui Guan, Kai Wang, Yang You

TL;DR
Info-Coevolution is a framework in which models and data coevolve through online selective annotation: it chooses which incoming data to annotate, significantly reducing annotation and training costs without sacrificing performance.
Contribution
The paper introduces a coevolution framework for data and models that uses online selective annotation to improve dataset efficiency and reduce annotation costs without introducing bias.
Findings
Reduces annotation and training costs by 32% on ImageNet-1K
Automatically determines optimal saving ratio without tuning
Further reduces annotation ratio to 50% with semi-supervised learning
Abstract
Machine learning relies heavily on data, yet the continuous growth of real-world data poses challenges for efficient dataset construction and training. A fundamental yet unsolved question is: given our current model and data, does a new piece of data (a sample or a batch) need to be annotated or learned? Conventional approaches retain all available data, leading to suboptimal data and training efficiency. Active learning aims to reduce data redundancy by selecting a subset of samples to annotate, but it increases pipeline complexity and introduces bias. In this work, we propose Info-Coevolution, a novel framework that efficiently enables models and data to coevolve through online selective annotation with no bias. Leveraging task-specific models (and open-source models), it selectively annotates and integrates online and web data to improve datasets efficiently. For real-world datasets like ImageNet-1K,…
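The question posed in the abstract (does a new sample need annotation, given the current model?) can be illustrated with a minimal sketch. The selection rule below is a plain confidence threshold, used here only as a stand-in; the summary above does not describe Info-Coevolution's actual criterion, and the names `SelectiveAnnotator`, `toy_model`, and the `threshold` value are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class SelectiveAnnotator:
    # predict_proba is a hypothetical stand-in for the current task model.
    predict_proba: Callable[[Sequence[float]], List[float]]
    # Assumed confidence cut-off; not a value taken from the paper.
    threshold: float = 0.9
    to_annotate: List[Sequence[float]] = field(default_factory=list)
    skipped: int = 0

    def offer(self, sample: Sequence[float]) -> bool:
        """Return True if the sample is queued for annotation."""
        confidence = max(self.predict_proba(sample))
        if confidence < self.threshold:
            # The model is unsure, so a label is likely to be informative.
            self.to_annotate.append(sample)
            return True
        # The model is already confident, so the sample is skipped.
        self.skipped += 1
        return False


def toy_model(sample: Sequence[float]) -> List[float]:
    # Toy two-class model: the first feature is read as P(class 0).
    p = min(max(sample[0], 0.0), 1.0)
    return [p, 1.0 - p]


annotator = SelectiveAnnotator(predict_proba=toy_model, threshold=0.9)
stream = [[0.99], [0.55], [0.02], [0.70]]
decisions = [annotator.offer(x) for x in stream]
print(decisions, annotator.skipped, len(annotator.to_annotate))
```

In this toy stream, only the two samples the model is unsure about are queued for labeling. In a coevolving setup the model would be retrained as labels arrive, so the same rule would select different samples over time.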
Taxonomy
Topics: Data Stream Mining Techniques · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
