Extracting Similar Questions From Naturally-occurring Business Conversations

Xiliang Zhu; David Rossouw; Shayna Gardiner; Simon Corston-Oliver

arXiv:2206.01585·cs.CL·June 6, 2022

Open Access

TL;DR

This paper shows that sentence-level representations from some off-the-shelf contextualized embedding models such as BERT capture question similarity poorly in business conversations, and proposes grouping questions with tuned representations and a small set of exemplars, presented in a visualization.

Contribution

It pairs appropriately tuned sentence representations with a small set of exemplar questions to improve question similarity detection in real-world business conversations.

Findings

01

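Sentence-level representations from some off-the-shelf models such as BERT have a narrow distribution in the embedding space, so they perform poorly at identifying similar questions in business conversations.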

02

Tuned representations with exemplars improve question grouping.

03

The resulting visualization supports data exploration and employee coaching.

Abstract

Pre-trained contextualized embedding models such as BERT are a standard building block in many natural language processing systems. We demonstrate that the sentence-level representations produced by some off-the-shelf contextualized embedding models have a narrow distribution in the embedding space, and thus perform poorly for the task of identifying semantically similar questions in real-world English business conversations. We describe a method that uses appropriately tuned representations and a small set of exemplars to group questions of interest to business users in a visualization that can be used for data exploration or employee coaching.
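
The two ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes sentence embeddings are already available as NumPy arrays. The function names (`mean_pairwise_cosine`, `assign_to_exemplars`), the default threshold of 0.7, and the toy vectors are hypothetical choices made for illustration. The first function gives a rough indication of how narrow a set of embeddings is: an average pairwise cosine similarity close to 1 means most sentences look alike to the model. The second function groups each question with its most similar exemplar, or leaves it unassigned when no exemplar is similar enough.

```python
import numpy as np


def normalize(vectors):
    """Scale each row to unit length so dot products equal cosine similarity."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def mean_pairwise_cosine(embeddings):
    """Average cosine similarity over all distinct pairs of rows.

    Values near 1.0 suggest the embeddings occupy a narrow region of the space.
    """
    unit = normalize(embeddings)
    sims = unit @ unit.T
    n = len(unit)
    off_diagonal = sims[~np.eye(n, dtype=bool)]  # drop self-similarities
    return float(off_diagonal.mean())


def assign_to_exemplars(question_vecs, exemplar_vecs, threshold=0.7):
    """Return, per question, the index of the most similar exemplar, or -1.

    The threshold is an assumed value, not one reported in the paper.
    """
    sims = normalize(question_vecs) @ normalize(exemplar_vecs).T
    best = sims.argmax(axis=1)
    best_sim = sims.max(axis=1)
    return np.where(best_sim >= threshold, best, -1)


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real sentence embeddings.
    exemplars = np.array([[1.0, 0.0], [0.0, 1.0]])
    questions = np.array([[0.9, 0.1], [0.1, 0.9], [1.0, 1.0]])
    print(mean_pairwise_cosine(questions))
    print(assign_to_exemplars(questions, exemplars, threshold=0.8))
```

In practice the threshold would need to be chosen on held-out data, since a narrow embedding distribution pushes all similarities toward the same value and makes any fixed cut-off unreliable.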

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: Attention Is All You Need · Linear Layer · Weight Decay · Softmax · Layer Normalization · Attention Dropout · Adam · WordPiece · Residual Connection