Building the Intent Landscape of Real-World Conversational Corpora with Extractive Question-Answering Transformers
Jean-Philippe Corbeil, Mia Taige Li, Hadi Abdi Ghavidel

TL;DR
This paper introduces an unsupervised pipeline using extractive QA transformers and clustering to map and understand intents in noisy real-world conversational data, aiding natural language understanding applications.
Contribution
It presents a novel unsupervised method combining extractive QA and clustering to extract and organize intents from real-world dialogues, even in noisy data environments.
Findings
Achieved over 85% linguistic validation rate for intent spans.
Reconstructed intent schemes with an average recall of 94.3%.
Demonstrated generalization of ELECTRA model to dialogue understanding.
Abstract
For companies with customer service, mapping intents inside their conversational data is crucial in building applications based on natural language understanding (NLU). Nevertheless, there is no established automated technique to gather the intents from noisy online chats or voice transcripts. Simple clustering approaches are not suited to intent-sparse dialogues. To solve this intent-landscape task, we propose an unsupervised pipeline that extracts the intents and the taxonomy of intents from real-world dialogues. Our pipeline mines intent-span candidates with an extractive Question-Answering ELECTRA model and leverages sentence embeddings to apply a low-level density clustering followed by a top-level hierarchical clustering. Our results demonstrate the generalization ability of an ELECTRA large model fine-tuned on the SQuAD2 dataset to understand dialogues. With the right prompting…
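The two-level clustering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the intent spans have already been extracted and embedded, so the spans and their 2-D vectors below are invented stand-ins for QA-model output and sentence embeddings. The DBSCAN-style density step, the average-linkage merge, the cosine distance, and the thresholds `eps`, `min_pts`, and `threshold` are all illustrative assumptions; the abstract does not name specific algorithms or values.

```python
import math
from collections import deque

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def dbscan(points, eps, min_pts):
    """Low-level density clustering: returns a cluster id per point, -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n) if cosine_distance(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:
                queue.extend(reach)
        cluster += 1
    return labels

def centroid(vectors):
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(len(vectors[0]))]

def agglomerate(centroids, threshold):
    """Top-level step: average-linkage merging of low-level clusters."""
    groups = [[i] for i in range(len(centroids))]

    def linkage(g, h):
        total = sum(cosine_distance(centroids[i], centroids[j]) for i in g for j in h)
        return total / (len(g) * len(h))

    while len(groups) > 1:
        dist, a, b = min((linkage(g, h), a, b)
                         for a, g in enumerate(groups)
                         for b, h in enumerate(groups) if a < b)
        if dist > threshold:
            break
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

# Hypothetical intent spans with toy 2-D "embeddings" (assumed, not from the paper).
spans = {
    "reset my password": (1.0, 0.05), "change my password": (1.0, 0.1),
    "recover my password": (1.0, 0.0), "unlock my account": (1.0, 0.5),
    "reactivate my account": (1.0, 0.55), "reopen my account": (1.0, 0.45),
    "get a refund": (0.1, 1.0), "refund my order": (0.05, 1.0),
    "obtain a refund": (0.0, 1.0), "thanks a lot": (-1.0, -1.0),
}
vectors = list(spans.values())

labels = dbscan(vectors, eps=0.01, min_pts=2)
cluster_ids = sorted(set(labels) - {-1})
centroids = [centroid([v for v, l in zip(vectors, labels) if l == c]) for c in cluster_ids]
groups = agglomerate(centroids, threshold=0.2)

print(labels)  # low-level intent clusters, -1 marks spans treated as noise
print(groups)  # top-level groups of low-level cluster ids
```

In this toy setup the password and account spans form separate low-level clusters that merge into one top-level group, the refund spans stay in their own group, and the chit-chat span is discarded as noise. A real pipeline would replace the hard-coded vectors with sentence embeddings of the spans returned by the QA model.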
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Attention Dropout · Residual Connection · Linear Warmup With Linear Decay · Softmax
