Active Learning and Multi-label Classification for Ellipsis and Coreference Detection in Conversational Question-Answering
Quentin Brabant, Lina Maria Rojas-Barahona, Claire Gardent

TL;DR
This paper presents a multi-label classification approach using DistilBERT and active learning to automatically detect ellipsis and coreference phenomena in conversational question-answering, improving performance with limited labeled data.
Contribution
It introduces a novel multi-label classifier combined with active learning for detecting ellipsis and coreference in dialogue, addressing data scarcity issues.
Findings
Enhanced detection accuracy with active learning
Effective use of multi-label classification for linguistic phenomena
Improved performance on a manually labeled dataset
Abstract
In human conversations, ellipsis and coreference are commonly occurring linguistic phenomena. Although these phenomena are a means of making human-machine conversations more fluent and natural, only a few dialogue corpora explicitly indicate which turns contain ellipses and/or coreferences. In this paper, we address the task of automatically detecting ellipses and coreferences in conversational question answering. We propose to use a multi-label classifier based on DistilBERT. Multi-label classification and active learning are employed to compensate for the limited amount of labeled data. We show that these methods greatly enhance the performance of the classifier for detecting these phenomena on a manually labeled dataset.
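The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the classifier emits one logit per label, that each label is decided independently with a sigmoid and a 0.5 threshold, and that active learning picks the turns whose label probabilities are closest to 0.5 (plain uncertainty sampling, which the abstract does not confirm as the query strategy). The label order, the function names, and the logit values in `pool` are all hypothetical.

```python
import math

# Hypothetical label order; the paper's actual label set may differ.
LABELS = ("ellipsis", "coreference")


def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Multi-label decision: each label is switched on independently,
    so a turn can carry both labels, one, or none."""
    return {label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold}


def uncertainty(logits):
    """Score a turn by its least confident label.
    The score approaches 0.5 when some label probability is near 0.5."""
    return max(0.5 - abs(sigmoid(z) - 0.5) for z in logits)


def select_for_annotation(pool, k):
    """Return the k unlabeled turns the model is least sure about
    (assumed uncertainty sampling)."""
    return sorted(pool, key=lambda turn: uncertainty(pool[turn]), reverse=True)[:k]


# Made-up logits standing in for the output of a fine-tuned encoder.
pool = {
    "And his wife?": (2.3, 1.9),
    "Who directed Alien?": (-3.0, -2.5),
    "When was it released?": (-0.1, 2.0),
}

for turn, logits in pool.items():
    print(turn, sorted(predict_labels(logits)))
print("Annotate next:", select_for_annotation(pool, 1))
```

In a loop of this kind, the selected turns would be labeled by hand, added to the training set, and the classifier retrained before the next selection round.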
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Linear Warmup With Linear Decay · Adam · Layer Normalization · Weight Decay · WordPiece · Softmax
