Pre-trained Language Model Based Active Learning for Sentence Matching
Guirong Bai, Shizhu He, Kang Liu, Jun Zhao, Zaiqing Nie

TL;DR
This paper introduces a novel active learning method for sentence matching that leverages pre-trained language models to select more informative instances, reducing annotation costs and improving accuracy.
Contribution
It proposes a pre-trained language model based active learning approach that incorporates linguistic criteria for more effective instance selection in sentence matching.
Findings
Achieves higher accuracy with fewer labeled instances
Outperforms entropy-based active learning methods
Demonstrates efficiency in the annotation process
Abstract
Active learning can significantly reduce the annotation cost of data-driven techniques. However, previous active learning approaches for natural language processing depend mainly on the entropy-based uncertainty criterion and ignore the characteristics of natural language. In this paper, we propose a pre-trained language model based active learning approach for sentence matching. Unlike previous active learning approaches, it provides linguistic criteria for measuring instances and helps select more efficient instances for annotation. Experiments demonstrate that our approach achieves greater accuracy with fewer labeled training instances.
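For context, the entropy-based uncertainty criterion that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the generic baseline, not the authors' proposed method; the function names and the example probabilities are assumptions made up for this sketch.

```python
import math
from typing import List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k pool instances with the highest entropy."""
    scores = [predictive_entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical match/no-match probabilities for four unlabeled sentence pairs.
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
]

# The two pairs the current model is least sure about are sent for annotation.
selected = select_most_uncertain(pool_probs, k=2)
print(selected)
```

A criterion like this looks only at the model's output distribution, which is the limitation the abstract points to: it does not consider any linguistic property of the sentence pair itself.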
Taxonomy
Topics: Machine Learning and Algorithms · Topic Modeling · Natural Language Processing Techniques
