ESIE-BERT: Enriching Sub-words Information Explicitly with BERT for Joint Intent Classification and Slot Filling

Yu Guo; Zhilong Xie; Xingyan Chen; Huangen Chen; Leilei Wang; Huaming Du; Shaopeng Wei; Yu Zhao; Qing Li; Gang Wu

arXiv:2211.14829·cs.CL·February 3, 2023·1 cites

TL;DR

This paper introduces ESIE-BERT, a novel method that explicitly models sub-word features and sentence-level intent information to improve joint intent classification and slot filling in natural language understanding tasks.

Contribution

It proposes a sub-words attention adapter and an intent attention adapter to better utilize sub-word and sentence features in BERT-based models.
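As a rough illustration of what attending over sub-word features can look like, the sketch below pools the vectors of all sub-tokens belonging to one word into a single word-level vector using scaled dot-product attention weights. This is not the paper's adapter, whose design is not described on this page. The function names, the fixed `query` vector (a stand-in for learned parameters), and the toy vectors are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    """Weighted average of `vectors`, weights from scaled dot products with `query`."""
    scale = math.sqrt(len(query))
    scores = [sum(q * v for q, v in zip(query, vec)) / scale for vec in vectors]
    weights = softmax(scores)
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(len(query))]

def pool_by_word(sub_vectors, word_ids, query):
    """Group sub-token vectors by word index and pool each group."""
    groups = {}
    for vec, wid in zip(sub_vectors, word_ids):
        groups.setdefault(wid, []).append(vec)
    return [attention_pool(groups[wid], query) for wid in sorted(groups)]

# Hypothetical toy data: 4 sub-token vectors covering 3 words.
sub_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, 1.0]]
word_ids = [0, 0, 1, 2]          # the first word was split into two sub-tokens
query = [1.0, 0.0]               # hypothetical stand-in for a learned query

pooled = pool_by_word(sub_vectors, word_ids, query)
print(len(pooled))               # one vector per word
```

A word with a single sub-token passes through unchanged, while a multi-piece word keeps a contribution from every piece instead of discarding all but the first.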

Findings

01. Significant improvement in slot filling F1 score on the ATIS dataset (from 96.1 to 98.2)

02. Enhanced model performance on two benchmark datasets

03. Addresses the sub-word mismatch issue in BERT for NLU tasks

Abstract

Natural language understanding (NLU) has two core tasks: intent classification and slot filling. The success of pre-trained language models has led to significant breakthroughs in both tasks. One promising solution, BERT, can jointly optimize the two tasks. We note that BERT-based models convert each complex token into multiple sub-tokens using the WordPiece algorithm, which creates a mismatch between the lengths of the token sequence and the label sequence. As a result, BERT-based models do not perform well at label prediction, which limits further performance improvement. Many existing models can accommodate this issue, but some hidden semantic information is discarded during fine-tuning. We address the problem by introducing a novel joint method on top of BERT which explicitly models the multiple sub-token features after WordPiece tokenization, thereby contributing to the two…
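The length mismatch described in the abstract can be reproduced with a small, self-contained sketch. It uses a simplified greedy longest-match WordPiece splitter over a hypothetical toy vocabulary (not BERT's real vocabulary), then applies one common workaround: label only the first sub-token of each word and mask the rest. That workaround is the kind of approach under which the remaining sub-tokens carry no label signal. The example sentence, labels, and the `"X"` mask symbol are assumptions made for illustration.

```python
def wordpiece(word, vocab):
    """Greedy longest-match-first split of a single word into sub-tokens."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            cand = word[start:end]
            if start > 0:
                cand = "##" + cand        # continuation pieces carry a ## prefix
            if cand in vocab:
                piece = cand
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]              # no matching piece: whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def tokenize_with_alignment(words, vocab):
    """Return the sub-tokens plus, for each sub-token, the index of its source word."""
    sub_tokens, word_ids = [], []
    for i, w in enumerate(words):
        pieces = wordpiece(w, vocab)
        sub_tokens.extend(pieces)
        word_ids.extend([i] * len(pieces))
    return sub_tokens, word_ids

def first_subtoken_labels(labels, word_ids, ignore="X"):
    """Keep the word label on the first sub-token; mask later sub-tokens."""
    out, prev = [], None
    for wid in word_ids:
        out.append(labels[wid] if wid != prev else ignore)
        prev = wid
    return out

# Hypothetical toy vocabulary and slot-labeled utterance.
VOCAB = {"play", "##ing", "red", "##bone", "by", "sam", "cooke"}
words = ["playing", "redbone", "by", "sam", "cooke"]
labels = ["O", "B-track", "O", "B-artist", "I-artist"]

sub_tokens, word_ids = tokenize_with_alignment(words, VOCAB)
aligned = first_subtoken_labels(labels, word_ids)

print(len(words), len(sub_tokens))   # 5 words vs. 7 sub-tokens
print(list(zip(sub_tokens, aligned)))
```

Here five labeled words become seven sub-tokens, so the label sequence has to be padded or masked before it can be compared with the model's per-sub-token outputs.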

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Adapter · Linear Layer · Weight Decay · Residual Connection · Dense Connections · Layer Normalization · Linear Warmup With Linear Decay