CTRAN: CNN-Transformer-based Network for Natural Language Understanding
Mehrdad Rafiepour, Javad Salimi Sartakhti

TL;DR
This paper introduces CTRAN, a CNN-Transformer hybrid model for intent detection and slot-filling in natural language understanding. It combines BERT, convolutional layers, and an aligned Transformer decoder, and reports state-of-the-art slot-filling results on ATIS and SNIPS.
Contribution
The paper presents a novel CNN-Transformer architecture for intent detection and slot-filling. Its slot-filling decoder is an aligned Transformer decoder, and the model outperforms prior methods in slot-filling on the ATIS and SNIPS datasets.
Findings
Surpasses the previous state of the art in slot-filling on ATIS and SNIPS.
Incorporating language models as word embeddings improves performance.
The aligned Transformer decoder effectively aligns output tags with input tokens.
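The alignment idea in the last finding can be sketched with a small attention example. This is a minimal illustration, not the authors' implementation: it assumes "zero diagonal mask" means an additive attention mask that is 0 on the diagonal and negative infinity elsewhere, so that output position i can only draw on input position i. The summary does not define the mask, and the function names `zero_diagonal_mask` and `masked_softmax` are hypothetical.

```python
import math

NEG_INF = float("-inf")


def zero_diagonal_mask(size):
    """Additive mask (assumed form): 0.0 on the diagonal, -inf everywhere else."""
    return [[0.0 if i == j else NEG_INF for j in range(size)] for i in range(size)]


def masked_softmax(scores, mask):
    """Row-wise softmax over (scores + mask); -inf entries get zero weight."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        shifted = [s + m for s, m in zip(score_row, mask_row)]
        peak = max(shifted)  # finite, because the diagonal entry is never masked
        exps = [math.exp(v - peak) for v in shifted]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy attention scores for a 3-token utterance (made-up numbers).
scores = [
    [0.2, 1.5, -0.3],
    [0.9, 0.1, 2.0],
    [1.1, 0.4, 0.7],
]
weights = masked_softmax(scores, zero_diagonal_mask(3))
for row in weights:
    print(row)
```

Under this assumed mask, every row of `weights` puts all of its attention on its own position, which is one way an output tag could be tied to exactly one input token.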
Abstract
Intent-detection and slot-filling are the two main tasks in natural language understanding. In this study, we propose CTRAN, a novel encoder-decoder CNN-Transformer-based architecture for intent-detection and slot-filling. In the encoder, we use BERT, followed by several convolutional layers, and rearrange the output using window feature sequence. We use stacked Transformer encoders after the window feature sequence. For the intent-detection decoder, we utilize self-attention followed by a linear layer. In the slot-filling decoder, we introduce the aligned Transformer decoder, which utilizes a zero diagonal mask, aligning output tags with input tokens. We apply our network on ATIS and SNIPS, and surpass the current state-of-the-art in slot-filling on both datasets. Furthermore, we incorporate the language model as word embeddings, and show that this strategy yields a better result when…
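The "window feature sequence" step in the abstract can be pictured with a toy example. The sketch below assumes the term means regrouping the per-filter convolution outputs into one feature vector per token position, and that the convolutions are zero-padded so the sequence length is preserved. Both points are assumptions, since the abstract does not spell them out. The helper names `feature_map` and `window_feature_sequence` and all numbers are hypothetical.

```python
def feature_map(embeddings, kernel):
    """Slide one filter (k x dim) along the token axis with zero padding,
    returning one scalar per token position."""
    k, dim = len(kernel), len(embeddings[0])
    left = (k - 1) // 2
    zero = [0.0] * dim
    padded = [zero] * left + embeddings + [zero] * (k - 1 - left)
    return [
        sum(padded[pos + off][d] * kernel[off][d] for off in range(k) for d in range(dim))
        for pos in range(len(embeddings))
    ]


def window_feature_sequence(feature_maps):
    """Regroup (n_filters x n_tokens) maps into (n_tokens x n_filters) vectors."""
    return [list(column) for column in zip(*feature_maps)]


# Four tokens with 2-dimensional embeddings (stand-ins for BERT outputs).
embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# Two toy filters with different window sizes (1 and 3 tokens).
kernel_size_1 = [[1.0, 0.0]]
kernel_size_3 = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

maps = [feature_map(embeddings, kernel_size_1), feature_map(embeddings, kernel_size_3)]
sequence = window_feature_sequence(maps)
print(sequence)
```

Each entry of `sequence` collects what every filter saw around one token, so the result still has one vector per input token and can be passed to a sequence model such as a stack of Transformer encoders.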
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Linear Warmup With Linear Decay · Position-Wise Feed-Forward Layer · Attention Dropout · Weight Decay · Adam
