Active Learning for New Domains in Natural Language Understanding

Stanislav Peshterliev; John Kearney; Abhyuday Jagannatha; Imre Kiss; Spyros Matsoukas

arXiv:1810.03450 · cs.CL · April 2, 2019 · 1 citation

Open Access

TL;DR

This paper introduces Majority-CRF, an active learning algorithm that improves NLU accuracy for new domains by choosing which utterances to annotate. With the same annotation budget, it achieves 6.6%-9% relative error rate reduction compared to random sampling.

Contribution

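The paper presents Majority-CRF, an active learning method for new domains in NLU systems. It uses an ensemble of classification models to select relevant utterances and a sequence labeling model to prioritize informative ones, and it shows statistically significant improvements over the other active learning approaches evaluated.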

Findings

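01 Majority-CRF achieves 6.6%-9% relative error rate reduction over random sampling with the same annotation budget, in experiments with three domains.

02 Its improvements over the other active learning approaches evaluated are statistically significant.

03 In case studies with human-in-the-loop AL on six new domains, an existing NLU system improves by 4.6%-9%.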
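
For readers unfamiliar with the metric, relative error rate reduction is the fraction of the baseline error that a new system removes. The snippet below is a minimal sketch of that arithmetic; the function name and the example error rates are hypothetical and are not results from the paper.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates, chosen only to illustrate the formula.
baseline = 0.200   # e.g. error rate after annotating randomly sampled utterances
improved = 0.182   # e.g. error rate after annotating actively selected utterances
print(f"{relative_error_reduction(baseline, improved):.1%}")
```

Note that a relative reduction of this size corresponds to a much smaller absolute change in error rate (1.8 points in the hypothetical numbers above).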

Abstract

We explore active learning (AL) for improving the accuracy of new domains in a natural language understanding (NLU) system. We propose an algorithm called Majority-CRF that uses an ensemble of classification models to guide the selection of relevant utterances, as well as a sequence labeling model to help prioritize informative examples. Experiments with three domains show that Majority-CRF achieves 6.6%-9% relative error rate reduction compared to random sampling with the same annotation budget, and statistically significant improvements compared to other AL approaches. Additionally, case studies with human-in-the-loop AL on six new domains show 4.6%-9% improvement on an existing NLU system.
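
The two-stage idea in the abstract (an ensemble filters for relevant utterances, then a sequence labeling model prioritizes them) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the filter or ranking rules, so the "majority positive with at least one dissenting vote" filter, the "lowest confidence first" ranking, and every name below (`select_for_annotation`, `classifiers`, `seq_confidence`) are hypothetical.

```python
from typing import Callable, List, Sequence

# Assumption: each classifier returns a signed margin; > 0 means "in the new domain".
Classifier = Callable[[str], float]
# Assumption: the sequence model exposes a confidence score for its best labeling.
SequenceConfidence = Callable[[str], float]


def select_for_annotation(
    pool: Sequence[str],
    classifiers: Sequence[Classifier],
    seq_confidence: SequenceConfidence,
    budget: int,
) -> List[str]:
    """Pick up to `budget` utterances from an unlabeled pool for annotation."""
    candidates = []
    for utterance in pool:
        positives = sum(1 for clf in classifiers if clf(utterance) > 0)
        majority_positive = 2 * positives > len(classifiers)
        has_dissent = 0 < positives < len(classifiers)
        # Assumed filter: likely relevant (majority says yes) but still uncertain.
        if majority_positive and has_dissent:
            candidates.append(utterance)
    # Assumed prioritization: least confident sequence labeling first.
    candidates.sort(key=seq_confidence)
    return candidates[:budget]
```

In a human-in-the-loop setting, the returned utterances would be sent for annotation, added to the training data, and the models retrained before the next selection round.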

Taxonomy

Topics: Machine Learning and Algorithms · Natural Language Processing Techniques · Topic Modeling