CTRAN: CNN-Transformer-based Network for Natural Language Understanding

Mehrdad Rafiepour; Javad Salimi Sartakhti

arXiv:2303.10606 · cs.CL · September 8, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces CTRAN, a CNN-Transformer hybrid model for intent detection and slot-filling. It combines BERT, convolutional layers, stacked Transformer encoders, and an aligned Transformer decoder, and surpasses the previous state of the art in slot-filling on ATIS and SNIPS.

Contribution

The paper presents a novel encoder-decoder CNN-Transformer architecture for intent detection and slot-filling. Its slot-filling decoder is an aligned Transformer decoder that uses a zero diagonal mask, and the model surpasses the previous state of the art in slot-filling on the ATIS and SNIPS datasets.

Findings

01 Surpasses the previous state of the art in slot-filling on ATIS and SNIPS.

02 Using the language model as word embeddings improves performance.

03 The aligned Transformer decoder, with its zero diagonal mask, aligns output tags with input tokens.

Abstract

Intent-detection and slot-filling are the two main tasks in natural language understanding. In this study, we propose CTRAN, a novel encoder-decoder CNN-Transformer-based architecture for intent-detection and slot-filling. In the encoder, we use BERT, followed by several convolutional layers, and rearrange the output using window feature sequence. We use stacked Transformer encoders after the window feature sequence. For the intent-detection decoder, we utilize self-attention followed by a linear layer. In the slot-filling decoder, we introduce the aligned Transformer decoder, which utilizes a zero diagonal mask, aligning output tags with input tokens. We apply our network on ATIS and SNIPS, and surpass the current state-of-the-art in slot-filling on both datasets. Furthermore, we incorporate the language model as word embeddings, and show that this strategy yields a better result when…
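The abstract names a zero diagonal mask but does not define it. One possible reading, assuming the additive attention-mask convention (0.0 keeps a position, -inf blocks it), is a square mask with zeros on the diagonal and -inf everywhere else, so that output position i can attend only to input position i. The sketch below illustrates that reading in plain Python. The function names are hypothetical, and nothing here is taken from the authors' repository.

```python
import math

NEG_INF = float("-inf")


def zero_diagonal_mask(size):
    """Additive mask: 0.0 on the diagonal (kept), -inf elsewhere (blocked).

    Assumption: this is one interpretation of "zero diagonal mask",
    not a definition confirmed by the abstract.
    """
    return [[0.0 if i == j else NEG_INF for j in range(size)]
            for i in range(size)]


def masked_softmax(scores, mask):
    """Row-wise softmax of (scores + mask); blocked entries get weight 0."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        shifted = [s + m for s, m in zip(score_row, mask_row)]
        peak = max(shifted)  # finite, because the diagonal entry is kept
        exps = [math.exp(v - peak) for v in shifted]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy attention scores for 3 output tags over 3 input tokens.
scores = [
    [0.2, 1.5, -0.3],
    [0.9, 0.1, 0.4],
    [-1.0, 2.0, 0.7],
]
weights = masked_softmax(scores, zero_diagonal_mask(3))
# Under this mask, row i puts all of its weight on column i.
```

Under this reading, the mask forces a one-to-one correspondence between tag positions and token positions, which is one way the alignment described in the abstract could be enforced.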

Code & Models

Repositories

rafiepour/CTran (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Linear Warmup With Linear Decay · Position-Wise Feed-Forward Layer · Attention Dropout · Weight Decay · Adam