TL;DR
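This paper introduces CAL (Contrastive Active Learning), an active learning method that selects contrastive examples: data points that are close in the model's feature space but receive very different predictive likelihoods. By combining uncertainty and diversity in this way, CAL matches or beats the best-performing baseline across four natural language understanding tasks and seven datasets.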
Contribution
The paper proposes CAL, an active learning acquisition function that leverages contrastive examples, combining uncertainty and diversity for better data selection.
Findings
CAL performs as well as or better than the best-performing baseline across all tasks, on both in-domain and out-of-domain data.
CAL achieves a better trade-off between uncertainty and diversity.
Extensive ablation confirms the effectiveness of contrastive example selection.
Abstract
Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult and diverse data points from the pool of unlabeled data, respectively. In this work, leveraging the best of both worlds, we propose an acquisition function that opts for selecting contrastive examples, i.e. data points that are similar in the model feature space and yet the model outputs maximally different predictive likelihoods. We compare our approach, CAL (Contrastive Active Learning), with a diverse set of acquisition functions in four natural language understanding tasks and seven datasets. Our experiments show that CAL performs consistently better than or equal to the best performing baseline across all tasks, on both in-domain and out-of-domain data. We also conduct an extensive ablation study of our method and we further analyze all actively acquired…
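The selection idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes Euclidean distance for feature-space similarity, a set of labeled reference points to compare against, and KL divergence as the measure of how much two predictive distributions differ; the abstract specifies none of these. The function names and the parameters `k` and `budget` are hypothetical.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) along the last axis, with clipping to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)


def contrastive_scores(pool_feats, pool_probs, ref_feats, ref_probs, k=3):
    """Score each unlabeled candidate by how strongly the model's prediction
    for it disagrees with the predictions for its k nearest reference points.

    pool_feats: (n_pool, d) features of unlabeled candidates
    pool_probs: (n_pool, c) predicted class probabilities for candidates
    ref_feats:  (n_ref, d)  features of reference (assumed: labeled) points
    ref_probs:  (n_ref, c)  predicted class probabilities for reference points
    """
    # Pairwise squared Euclidean distances, shape (n_pool, n_ref).
    diffs = pool_feats[:, None, :] - ref_feats[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)

    # Indices of the k nearest reference points per candidate, shape (n_pool, k).
    nn_idx = np.argsort(dists, axis=1)[:, :k]

    # Neighbor predictions, shape (n_pool, k, c).
    neighbor_probs = ref_probs[nn_idx]

    # Divergence between each neighbor's prediction and the candidate's own.
    kl = kl_divergence(neighbor_probs, pool_probs[:, None, :])

    # Average over neighbors: high score = similar features, different outputs.
    return kl.mean(axis=1)


def select_contrastive(pool_feats, pool_probs, ref_feats, ref_probs, k=3, budget=1):
    """Return indices of the `budget` highest-scoring candidates."""
    scores = contrastive_scores(pool_feats, pool_probs, ref_feats, ref_probs, k=k)
    return np.argsort(-scores)[:budget]
```

Under these assumptions, a candidate that sits next to reference points the model classifies very differently gets a high score, while one whose neighbors agree with it scores near zero. Restricting the comparison to nearest neighbors is what ties the uncertainty signal to local structure in feature space.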
