Active Learning for Mention Detection: A Comparison of Sentence Selection Strategies
Nitin Madnani, Hongyan Jing, Nanda Kambhatla, Salim Roukos

TL;DR
This paper compares different sentence selection strategies for active learning in mention detection, showing that an effective strategy can significantly reduce labeled data needs while maintaining performance.
Contribution
It introduces and evaluates several sentence selection strategies for active learning in mention detection, highlighting a confidence-based strategy that needs far less labeled data than random selection to reach the same performance.
Findings
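The best strategy reduces the required labeled training data by over 50% compared to random selection while achieving the same performance.
For named mentions only, it achieves the same performance with only 42% of the training data that random selection requires.
The best strategy uses the sum of confidences of two statistical classifiers trained on different views of the data.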
Abstract
We propose and compare various sentence selection strategies for active learning for the task of detecting mentions of entities. The best strategy employs the sum of confidences of two statistical classifiers trained on different views of the data. Our experimental results show that, compared to the random selection strategy, this strategy reduces the amount of required labeled training data by over 50% while achieving the same performance. The effect is even more significant when only named mentions are considered: the system achieves the same performance by using only 42% of the training data required by the random selection strategy.
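The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a sentence-level confidence is derived from each classifier, nor whether low or high summed confidence triggers selection. The sketch assumes each classifier already yields one confidence score per unlabeled sentence and that the sentences with the lowest summed confidence are sent for annotation, as is common in uncertainty-based active learning. The function name `select_sentences` and its parameters are hypothetical.

```python
from typing import List, Sequence


def select_sentences(
    conf_view_a: Sequence[float],
    conf_view_b: Sequence[float],
    batch_size: int,
) -> List[int]:
    """Pick the indices of the sentences to annotate next.

    conf_view_a / conf_view_b: per-sentence confidences from two classifiers
    trained on different views of the data (assumed to be precomputed).
    Assumption: lower summed confidence means the sentence is more informative.
    """
    if len(conf_view_a) != len(conf_view_b):
        raise ValueError("both classifiers must score the same sentences")

    # Combine the two views by summing their confidences per sentence.
    summed = [a + b for a, b in zip(conf_view_a, conf_view_b)]

    # Rank sentence indices from least to most confident and keep the first batch.
    ranked = sorted(range(len(summed)), key=lambda i: summed[i])
    return ranked[:batch_size]


if __name__ == "__main__":
    # Toy confidences for four unlabeled sentences (illustrative values only).
    view_a = [0.9, 0.4, 0.7, 0.2]
    view_b = [0.8, 0.5, 0.3, 0.6]
    print(select_sentences(view_a, view_b, batch_size=2))
```

In a full active learning loop, the selected sentences would be labeled, added to the training set, and both classifiers retrained before the next round of scoring. The random selection baseline corresponds to drawing `batch_size` indices uniformly instead of ranking them.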
Taxonomy
Topics: Machine Learning and Algorithms · Topic Modeling · Natural Language Processing Techniques
