Task Conditioned BERT for Joint Intent Detection and Slot-filling

Diogo Tavares, Pedro Azevedo, David Semedo, Ricardo Sousa, and João Magalhães

arXiv:2308.06165·cs.CL·August 14, 2023

TL;DR

This paper introduces a unified Transformer-based model conditioned on multiple dialogue inference tasks, improving joint intent detection and slot-filling performance in dialogue systems, especially on complex and real-world data.

Contribution

It proposes a novel conditioned BERT model that leverages multiple dialogue inference tasks simultaneously, enhancing transfer learning and performance in intent and slot detection.

Findings

01. Improved joint intent and slot detection by up to 14.4% on MultiWOZ.

02. Conditioning on multiple tasks enhances language interaction learning.

03. High performance maintained in real-world customer dialogues.

Abstract

Dialogue systems need to deal with the unpredictability of user intents to track dialogue state and the heterogeneity of slots to understand user preferences. In this paper we investigate the hypothesis that solving these challenges as one unified model will allow the transfer of parameter support data across the different tasks. The proposed principled model is based on a Transformer encoder, trained on multiple tasks, and leveraged by a rich input that conditions the model on the target inferences. Conditioning the Transformer encoder on multiple target inferences over the same corpus, i.e., intent and multiple slot types, allows learning richer language interactions than a single-task model would be able to. In fact, experimental results demonstrate that conditioning the model on an increasing number of dialogue inference tasks leads to improved results: on the MultiWOZ dataset, the…
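The conditioning idea in the abstract can be pictured with a minimal sketch: one shared encoder receives the same utterance several times, each time prefixed with a marker naming the target inference (the intent, or one slot type). The input layout, the task names, and the slot types below are assumptions made for illustration; they are not taken from the paper, which may encode the conditioning differently.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical task marker; the paper's actual conditioning tokens are not given here.
INTENT_TASK = "intent"


@dataclass
class ConditionedExample:
    task: str          # target inference the shared encoder is conditioned on
    tokens: List[str]  # token sequence that would be fed to the encoder


def build_conditioned_inputs(utterance: str, slot_types: List[str]) -> List[ConditionedExample]:
    """Create one encoder input per target inference for a single utterance.

    Assumed layout: [CLS] <task marker> [SEP] <utterance words> [SEP]
    """
    words = utterance.split()
    tasks = [INTENT_TASK] + [f"slot:{s}" for s in slot_types]
    return [
        ConditionedExample(task=t, tokens=["[CLS]", t, "[SEP]", *words, "[SEP]"])
        for t in tasks
    ]


# Illustrative utterance and slot types (assumed, not from the paper).
examples = build_conditioned_inputs(
    "book a table for two in cambridge", ["area", "people"]
)
for ex in examples:
    print(ex.task, ex.tokens)
```

Because every conditioned copy passes through the same encoder, the parameters are shared across the intent task and each slot task, which is the kind of cross-task transfer the abstract hypothesizes.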

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Context-Aware Activity Recognition Systems

Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Adam · Label Smoothing · Position-Wise Feed-Forward Layer · Linear Layer · Residual Connection · Dense Connections