End-to-End Speech to Intent Prediction to improve E-commerce Customer Support Voicebot in Hindi and English

Abhinav Goyal; Anupam Singh; Nikesh Garera

arXiv:2211.07710·cs.CL·May 31, 2023

TL;DR

This paper presents an end-to-end speech-to-intent model for bilingual customer support voicebots in Hindi and English, achieving better accuracy and efficiency than traditional pipeline approaches.

Contribution

The paper introduces a novel E2E speech-to-intent system leveraging pre-trained ASR models, reducing complexity and improving performance in bilingual customer support applications.

Findings

01

Best E2E model outperforms a conventional pipeline by a relative ~27% in F1 score

02

Fine-tuning a pre-trained ASR model on small annotated datasets is effective

03

Simplifies deployment and reduces latency
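
The ~27% in the first finding is a relative gain over the pipeline's F1 score, as the abstract states, not a difference in absolute F1 points. A minimal sketch of that distinction follows. The two F1 values are hypothetical, chosen only so the arithmetic is easy to follow; they are not figures reported in the paper, and `relative_improvement` is an illustrative helper name.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical F1 scores for illustration only (NOT taken from the paper).
pipeline_f1 = 0.50
e2e_f1 = 0.635

absolute_gain = e2e_f1 - pipeline_f1
relative_gain = relative_improvement(pipeline_f1, e2e_f1)

print(f"absolute gain: {absolute_gain:.3f} F1 points")
print(f"relative gain: {relative_gain:.1%}")
```

With these made-up numbers the absolute gap is much smaller than the relative one, which is why the qualifier "relative" matters when reading the headline result.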

Abstract

Automation of on-call customer support relies heavily on accurate and efficient speech-to-intent (S2I) systems. Building such systems using multi-component pipelines can pose various challenges because they require large annotated datasets, have higher latency, and are complex to deploy. These pipelines are also prone to compounding errors. To overcome these challenges, we discuss an end-to-end (E2E) S2I model for a customer support voicebot task in a bilingual setting. We show how we can solve E2E intent classification by leveraging a pre-trained automatic speech recognition (ASR) model with slight modification and fine-tuning on small annotated datasets. Experimental results show that our best E2E model outperforms a conventional pipeline by a relative ~27% on the F1 score.
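
The abstract does not say what the "slight modification" to the ASR model is. One common way to turn a speech encoder into an intent classifier is to pool its frame-level outputs into a single utterance vector and feed that to a small classification layer. The sketch below illustrates that general idea in plain Python, under that assumption. It is not the authors' architecture: `IntentHead`, `mean_pool`, the dimensions, and the random stand-in for encoder outputs are all hypothetical, and fine-tuning is not shown.

```python
import math
import random
from typing import List

Vector = List[float]


def mean_pool(frames: List[Vector]) -> Vector:
    """Average frame-level encoder outputs into one utterance-level vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def softmax(logits: Vector) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class IntentHead:
    """Hypothetical linear classifier placed on top of a speech encoder."""

    def __init__(self, hidden_dim: int, num_intents: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Small random weights; in practice these would be learned by fine-tuning.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
            for _ in range(num_intents)
        ]
        self.bias = [0.0] * num_intents

    def __call__(self, frames: List[Vector]) -> Vector:
        pooled = mean_pool(frames)
        logits = [
            sum(w * x for w, x in zip(row, pooled)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return softmax(logits)


# Random stand-in for the outputs of a pre-trained ASR encoder:
# 20 frames, each an 8-dimensional hidden vector (assumed sizes).
rng = random.Random(1)
encoder_frames = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(20)]

head = IntentHead(hidden_dim=8, num_intents=4)
probs = head(encoder_frames)
predicted_intent = max(range(len(probs)), key=probs.__getitem__)
print(f"intent probabilities: {[round(p, 3) for p in probs]}")
print(f"predicted intent index: {predicted_intent}")
```

The point of the sketch is the data flow: audio goes through one model and comes out as an intent distribution, with no intermediate transcript for a separate text classifier to consume.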

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Topic Modeling
