Pre-trained Language Model Based Active Learning for Sentence Matching

Guirong Bai; Shizhu He; Kang Liu; Jun Zhao; Zaiqing Nie

arXiv:2010.05522 · cs.CL · October 13, 2020

TL;DR

This paper introduces a novel active learning method for sentence matching that leverages pre-trained language models to select more informative instances, reducing annotation costs and improving accuracy.

Contribution

It proposes a pre-trained language model based active learning approach that incorporates linguistic criteria for more effective instance selection in sentence matching.

Findings

01 Achieves higher accuracy with fewer labeled training instances

02 Outperforms entropy-based active learning methods

03 Reduces the annotation effort needed for sentence matching

Abstract

Active learning can significantly reduce the annotation cost of data-driven techniques. However, previous active learning approaches for natural language processing depend mainly on the entropy-based uncertainty criterion and ignore the characteristics of natural language. In this paper, we propose a pre-trained language model based active learning approach for sentence matching. Unlike previous active learning approaches, it provides linguistic criteria for measuring instances and helps select more informative instances for annotation. Experiments demonstrate that our approach achieves greater accuracy with fewer labeled training instances.
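For context, the entropy-based uncertainty criterion that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the baseline only, not the authors' method: the function names `entropy` and `select_most_uncertain` and the probabilities in `pool_predictions` are hypothetical, chosen for the example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(predictions, k):
    """Return the indices of the k instances with the highest predictive entropy."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs (match / no-match probabilities) for four
# unlabeled sentence pairs; these numbers are made up for illustration.
pool_predictions = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
]

# Pick the two pairs the model is least sure about and send them for annotation.
selected = select_most_uncertain(pool_predictions, k=2)
print(selected)
```

The criterion looks only at the model's output distribution, which is the limitation the abstract points to: two sentence pairs with the same predicted probabilities receive the same score regardless of their linguistic content.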

Taxonomy

Topics: Machine Learning and Algorithms · Topic Modeling · Natural Language Processing Techniques