Error Detection in Large-Scale Natural Language Understanding Systems Using Transformer Models
Rakesh Chada, Pradeep Natarajan, Darshan Fofadiya, Prathap Ramachandra

TL;DR
This paper presents a Transformer-based approach to detect domain classification errors in large-scale conversational AI systems, significantly improving error detection accuracy over baseline models.
Contribution
It introduces a multitask fine-tuning method that combines RoBERTa utterance encodings with the N-best hypotheses produced by the production system, achieving higher F1 scores in error detection for large-scale systems.
Findings
Achieves a 30% F1 score in detecting misclassifications from a domain that accounts for <0.5% of traffic, outperforming baselines.
Ensembling models raises the F1 score to 32.2%.
Effective detection of rare domain errors in large-scale AI systems.
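For readers who want to relate the F1 figures above to raw counts, here is a minimal sketch of how F1 is computed and how an ensemble could be scored. The source does not say how the models were ensembled; averaging per-utterance error probabilities and thresholding at 0.5 is an assumption made purely for illustration, and all function names are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall; returns 0.0 when undefined."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ensemble_predict(prob_lists, threshold=0.5):
    """Average error probabilities across models, then threshold.

    Assumption: simple probability averaging; the paper's ensembling
    scheme is not described in this summary.
    """
    n_models = len(prob_lists)
    averaged = [sum(ps) / n_models for ps in zip(*prob_lists)]
    return [p >= threshold for p in averaged]


def confusion_counts(predictions, labels):
    """Count true positives, false positives, and false negatives."""
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    fp = sum(1 for p, y in zip(predictions, labels) if p and not y)
    fn = sum(1 for p, y in zip(predictions, labels) if not p and y)
    return tp, fp, fn


if __name__ == "__main__":
    # Two hypothetical models scoring three utterances (toy numbers, not paper data).
    model_probs = [[0.9, 0.2, 0.6], [0.7, 0.4, 0.8]]
    labels = [True, True, False]
    preds = ensemble_predict(model_probs)
    print(preds, f1_score(*confusion_counts(preds, labels)))
```

Because the target domain is rare, precision and recall are both computed on the positive (error) class only, which is why a seemingly low F1 can still represent a useful detector.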
Abstract
Large-scale conversational assistants like Alexa, Siri, Cortana and Google Assistant process every utterance using multiple models for domain, intent and named entity recognition. Given the decoupled nature of model development and large traffic volumes, it is extremely difficult to identify utterances processed erroneously by such systems. We address this challenge to detect domain classification errors using offline Transformer models. We combine utterance encodings from a RoBERTa model with the N-best hypotheses produced by the production system. We then fine-tune end-to-end in a multitask setting using a small dataset of human-annotated utterances with domain classification errors. We tested our approach for detecting misclassifications from one domain that accounts for <0.5% of the traffic in a large-scale conversational AI system. Our approach achieves an F1 score of 30%…
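The data flow described in the abstract (an utterance encoding combined with the production system's N-best hypotheses, feeding multiple task heads) can be sketched roughly as follows. This is not the authors' implementation: the RoBERTa encoder is replaced by a placeholder vector, the domain inventory, the N-best featurization, the layer sizes, and the auxiliary task are all hypothetical, and only an untrained forward pass is shown.

```python
import math
import random

DOMAINS = ["Music", "Weather", "Shopping"]  # hypothetical domain inventory


def encode_nbest(nbest, domains=DOMAINS, n=3):
    """Flatten the top-n (domain, score) hypotheses into a fixed-length vector.

    Each slot holds a one-hot domain indicator plus the hypothesis score;
    missing hypotheses are zero-padded. This featurization is an assumption.
    """
    vec = []
    for i in range(n):
        slot = [0.0] * (len(domains) + 1)
        if i < len(nbest):
            domain, score = nbest[i]
            slot[domains.index(domain)] = 1.0
            slot[-1] = score
        vec.extend(slot)
    return vec


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_matrix(rows, cols, rng):
    """Small random weights, for illustration only."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MultitaskSketch:
    """Shared layer over [utterance ; N-best] features with two task heads."""

    def __init__(self, utt_dim, nbest_dim, hidden=8, n_aux=len(DOMAINS), seed=0):
        rng = random.Random(seed)
        self.w_shared = init_matrix(hidden, utt_dim + nbest_dim, rng)
        self.b_shared = [0.0] * hidden
        self.w_err = init_matrix(1, hidden, rng)      # main task: is this an error?
        self.b_err = [0.0]
        self.w_aux = init_matrix(n_aux, hidden, rng)  # hypothetical auxiliary task
        self.b_aux = [0.0] * n_aux

    def forward(self, utt_vec, nbest_vec):
        x = list(utt_vec) + list(nbest_vec)  # concatenate the two feature sets
        h = [math.tanh(v) for v in linear(self.w_shared, self.b_shared, x)]
        error_prob = 1.0 / (1.0 + math.exp(-linear(self.w_err, self.b_err, h)[0]))
        aux_logits = linear(self.w_aux, self.b_aux, h)
        return error_prob, aux_logits


if __name__ == "__main__":
    utt_vec = [0.1, -0.3, 0.5, 0.2]  # placeholder for a RoBERTa utterance encoding
    nbest_vec = encode_nbest([("Music", 0.7), ("Weather", 0.2)])
    model = MultitaskSketch(utt_dim=len(utt_vec), nbest_dim=len(nbest_vec))
    print(model.forward(utt_vec, nbest_vec))
```

In an actual fine-tuning setup the placeholder vector would come from the encoder and the losses of both heads would be backpropagated through it; the sketch omits training entirely to keep the shape of the computation visible.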
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Attention Dropout · WordPiece · Tanh Activation · Sigmoid Activation
