Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection
Alexander Podolskiy, Dmitry Lipin, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

TL;DR
This paper demonstrates that fine-tuning Transformer encoders combined with Mahalanobis distance significantly improves out-of-domain detection in intent classification, setting new state-of-the-art results without requiring out-of-domain training data.
Contribution
The paper scores out-of-domain utterances by applying Mahalanobis distance to representations from fine-tuned Transformer encoders, and reports that this outperforms existing methods.
Findings
Mahalanobis distance with Transformer embeddings outperforms other methods.
Fine-tuning Transformer encoders improves in-domain representation homogeneity.
Proposed method achieves new state-of-the-art results on three datasets.
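As a rough illustration of the scoring idea summarized above, the sketch below fits one Gaussian per in-domain intent with a shared covariance matrix and scores an utterance by its smallest squared Mahalanobis distance to any intent mean. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `fit_gaussians` and `ood_score` are hypothetical, the inputs are assumed to be fixed-size sentence embeddings already extracted from an encoder, and the shared-covariance and pseudo-inverse choices are assumptions made for this example.

```python
import numpy as np


def fit_gaussians(features, labels):
    """Estimate per-class means and the inverse of a shared covariance.

    features: (n_samples, dim) array of in-domain embeddings (assumed input).
    labels:   (n_samples,) array of integer intent labels.
    """
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center each sample by the mean of its own class, then pool all classes.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    # Pseudo-inverse is an assumption here; it tolerates a singular covariance.
    precision = np.linalg.pinv(cov)
    return means, precision


def ood_score(x, means, precision):
    """Return the minimum squared Mahalanobis distance to any class mean.

    Larger scores suggest the utterance is further from every known intent.
    """
    diffs = x[:, None, :] - means[None, :, :]  # (n, n_classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diffs, precision, diffs)
    return d2.min(axis=1)
```

In use, an utterance would be flagged as out-of-domain when its score exceeds a threshold chosen on held-out in-domain data; how that threshold is set is left open in this sketch.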
Abstract
Real-life applications, heavily relying on machine learning, such as dialog systems, demand out-of-domain detection methods. Intent classification models should be equipped with a mechanism to distinguish seen intents from unseen ones so that the dialog agent is capable of rejecting the latter and avoiding undesired behavior. However, despite increasing attention paid to the task, the best practices for out-of-domain intent detection have not yet been fully established. This paper conducts a thorough comparison of out-of-domain intent detection methods. We prioritize the methods, not requiring access to out-of-domain data during training, gathering of which is extremely time- and labor-consuming due to lexical and stylistic variation of user utterances. We evaluate multiple contextual encoders and methods, proven to be efficient, on three standard datasets for intent classification,…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Byte Pair Encoding · Multi-Head Attention · Attention Is All You Need · Dropout · Softmax
