A Semi-Supervised Deep Clustering Pipeline for Mining Intentions From Texts
Xinyu Chen, Ian Beaver

TL;DR
This paper introduces Verint Intent Manager (VIM), an analysis platform built around a semi-supervised deep clustering pipeline. The pipeline uses fine-tuned language models and community detection to mine user intentions from conversational texts, supporting the development of Intelligent Virtual Assistants (IVAs).
Contribution
The paper presents a novel semi-supervised clustering pipeline for mining intents from texts. It combines fine-tuned language models, distributed k-NN graph building, and community detection, and offers flexible clustering options.
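The three stages named above (embed, build a k-NN graph, detect communities) can be illustrated with a minimal sketch. It is not the authors' implementation: the hand-written 2-D vectors stand in for embeddings from a fine-tuned language model, the k-NN search is a single-machine brute-force loop rather than a distributed one, and label propagation is swapped in as a simple community detection method because the summary does not say which algorithm VIM uses. The example texts and all function names are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_graph(vectors, k):
    """Build a symmetric, similarity-weighted k-NN graph: {node: {neighbor: weight}}."""
    graph = defaultdict(dict)
    for i, u in enumerate(vectors):
        sims = [(cosine(u, v), j) for j, v in enumerate(vectors) if j != i]
        sims.sort(key=lambda t: (-t[0], t[1]))  # most similar first, ties by index
        for sim, j in sims[:k]:
            graph[i][j] = sim
            graph[j][i] = sim
    return graph


def label_propagation(graph, n_nodes, max_iters=20):
    """Assumed stand-in for community detection: each node repeatedly adopts
    the label with the largest total edge weight among its neighbors."""
    labels = list(range(n_nodes))  # every node starts in its own community
    for _ in range(max_iters):
        changed = False
        for node in range(n_nodes):
            votes = defaultdict(float)
            for neighbor, weight in graph[node].items():
                votes[labels[neighbor]] += weight
            if not votes:
                continue
            # Deterministic tie-break: highest weight, then smallest label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Invented example inputs; the vectors are placeholders for model embeddings.
texts = [
    "cancel my booking", "I want to cancel", "please cancel the order",
    "where is my bag", "my luggage is lost", "lost baggage claim",
]
vectors = [
    [1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
    [0.1, 1.0], [0.2, 0.9], [0.0, 1.0],
]

graph = knn_graph(vectors, k=2)
labels = label_propagation(graph, len(vectors))

clusters = defaultdict(list)
for text, label in zip(texts, labels):
    clusters[label].append(text)
for members in clusters.values():
    print(members)
```

With these toy vectors the first three texts land in one community and the last three in another. In a real setting the embedding step would carry most of the weight, since the quality of the graph depends on how well the fine-tuned model separates intents.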
Findings
Fine-tuning BERT on as little as 0.5% labeled data improves its task-specific representations.
With only 2.5% labeled data, clustering on fine-tuned BERT representations outperforms state-of-the-art clustering results.
The pipeline enhances data analyst efficiency and accelerates IVA deployment.
Abstract
Mining the latent intentions from large volumes of natural language inputs is a key step to help data analysts design and refine Intelligent Virtual Assistants (IVAs) for customer service. To aid data analysts in this task, we present Verint Intent Manager (VIM), an analysis platform that combines unsupervised and semi-supervised approaches to help analysts quickly surface and organize relevant user intentions from conversational texts. For the initial exploration of data, we make use of a novel unsupervised and semi-supervised pipeline that integrates the fine-tuning of high-performing language models, a distributed k-NN graph building method, and community detection techniques for mining the intentions and topics from texts. The fine-tuning step is necessary because pre-trained language models cannot encode texts to efficiently surface particular clustering structures when the target…
Taxonomy
Topics: Topic Modeling · Recommender Systems and Techniques · Sentiment Analysis and Opinion Mining
Methods: Attention Is All You Need · Linear Layer · Weight Decay · WordPiece · Dense Connections · Linear Warmup With Linear Decay · Residual Connection · Softmax
