Active Learning and Multi-label Classification for Ellipsis and Coreference Detection in Conversational Question-Answering

Quentin Brabant; Lina Maria Rojas-Barahona; Claire Gardent

arXiv:2207.03145·cs.CL·July 8, 2022

TL;DR

This paper presents a multi-label classification approach using DistilBERT and active learning to automatically detect ellipsis and coreference phenomena in conversational question-answering, improving performance with limited labeled data.

Contribution

It introduces a novel multi-label classifier combined with active learning for detecting ellipsis and coreference in dialogue, addressing data scarcity issues.

Findings

01. Enhanced detection accuracy with active learning

02. Effective use of multi-label classification for linguistic phenomena

03. Improved performance on a manually labeled dataset

Abstract

In human conversations, ellipsis and coreference are commonly occurring linguistic phenomena. Although these phenomena are a means of making human-machine conversations more fluent and natural, only a few dialogue corpora contain explicit indications of which turns contain ellipses and/or coreferences. In this paper we address the task of automatically detecting ellipses and coreferences in conversational question answering. We propose to use a multi-label classifier based on DistilBERT. Multi-label classification and active learning are employed to compensate for the limited amount of labeled data. We show that these methods greatly enhance the performance of the classifier for detecting these phenomena on a manually labeled dataset.
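The two ideas named in the abstract, multi-label classification and active learning, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that some encoder (DistilBERT in the paper) has already produced one logit per label for each dialogue turn, and it uses entropy-based uncertainty sampling as the active learning criterion, which is a common choice but an assumption here, since the abstract does not say which selection strategy is used. All function names, the threshold, and the example logits are hypothetical.

```python
import math

# The two phenomena named in the abstract; one independent output per label.
LABELS = ("ellipsis", "coreference")


def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Multi-label decision: each label is thresholded independently,
    so a turn can carry both labels, one of them, or neither.
    The 0.5 threshold is an illustrative assumption."""
    return {label: sigmoid(z) >= threshold for label, z in zip(LABELS, logits)}


def binary_entropy(p):
    """Entropy of a Bernoulli(p) prediction; highest at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def uncertainty(logits):
    """Total uncertainty of one turn: sum of per-label entropies."""
    return sum(binary_entropy(sigmoid(z)) for z in logits)


def select_for_annotation(pool_logits, budget):
    """Return indices of the `budget` most uncertain unlabeled turns.
    Uncertainty sampling is assumed here, not taken from the paper."""
    ranked = sorted(
        range(len(pool_logits)),
        key=lambda i: uncertainty(pool_logits[i]),
        reverse=True,
    )
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical logits for four unlabeled turns: [ellipsis, coreference].
    pool = [[4.0, -4.0], [0.1, -0.2], [3.0, 0.0], [-5.0, -5.0]]
    print(predict_labels(pool[0]))
    print(select_for_annotation(pool, budget=2))
```

In an active learning loop, the selected turns would be sent to a human annotator, added to the labeled set, and the classifier retrained before the next selection round.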

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Linear Warmup With Linear Decay · Adam · Layer Normalization · Weight Decay · WordPiece · Softmax