Self-Attention Networks for Intent Detection
Sevinj Yolchuyeva, Géza Németh, Bálint Gyires-Tóth

TL;DR
This paper introduces a novel intent detection system combining self-attention networks with Bi-LSTM, leveraging transformer models and universal sentence encoders to improve accuracy across multiple NLP datasets.
Contribution
It presents a new intent detection approach that integrates SANs with Bi-LSTM and utilizes transformer and deep averaging network encoders for enhanced performance.
Findings
Improved accuracy over LSTM-based models on multiple datasets
Effective capture of long-range dependencies in intent detection
Evaluated on the Snips, Smart Speaker, Smart Lights, and ATIS datasets
Abstract
Self-attention networks (SANs) have shown promising performance in various Natural Language Processing (NLP) scenarios, especially in machine translation. One of the main strengths of SANs is their ability to capture long-range and multi-scale dependencies in the data. In this paper, we present a novel intent detection system based on a self-attention network and a Bi-LSTM. Our approach improves on previous solutions by using a transformer model and a deep averaging network-based universal sentence encoder. We evaluate the system on the Snips, Smart Speaker, Smart Lights, and ATIS datasets using several evaluation metrics. The performance of the proposed model is compared with that of an LSTM on the same datasets.
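To make the abstract's point about long-range dependencies concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not the authors' implementation: queries, keys, and values are all the raw input vectors (no learned projections), there is a single head, and the Bi-LSTM and sentence-encoder components are omitted. The names `softmax`, `self_attention`, and `tokens` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Single-head scaled dot-product self-attention with Q = K = V = X.

    Assumption: no learned projection matrices, unlike a full transformer layer.
    X is a list of token vectors, all of the same dimension d.
    Returns (outputs, weights), where weights[i][j] is how much token i
    attends to token j.
    """
    d = len(X[0])
    outputs, weights = [], []
    for q in X:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, X)) for i in range(d)])
    return outputs, weights


# Toy input: three 2-dimensional "token" vectors (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attn = self_attention(tokens)
```

Because every token scores every other token directly, the distance between two positions in the sequence does not affect how strongly they can interact, which is the property the abstract refers to.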
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Sigmoid Activation · Dropout
