Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

Qingyue Wang; Liang Ding; Yanan Cao; Yibing Zhan; Zheng Lin; Shi Wang; Dacheng Tao; Li Guo

arXiv:2306.00434·cs.CL·June 2, 2023·1 cites

TL;DR

This paper introduces a novel zero-shot dialogue state tracking method that disentangles sample semantics and uses a mixture-of-experts approach, significantly improving performance without external data.

Contribution

It proposes a divide, conquer, and combine strategy that explicitly separates sample semantics and employs a mixture-of-experts mechanism for better zero-shot DST.

Findings

01 Achieves state-of-the-art zero-shot DST performance on MultiWOZ2.1

02 Uses only 10M trainable parameters

03 Significantly improves robustness and generalization

Abstract

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to handle a variety of task-oriented dialogue domains without the cost of collecting in-domain data. Existing works mainly study common data- or model-level augmentation methods to enhance generalization but fail to effectively decouple the semantics of samples, which limits the zero-shot performance of DST. In this paper, we present a simple and effective "divide, conquer and combine" solution, which explicitly disentangles the semantics of seen data and improves performance and robustness with the mixture-of-experts mechanism. Specifically, we divide the seen data into semantically independent subsets and train corresponding experts; newly unseen samples are then mapped to and inferred with the mixture of experts using our designed ensemble inference. Extensive experiments on MultiWOZ2.1 upon the T5-Adapter show our…
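The divide, conquer, and combine steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the data are divided or how the ensemble is weighted, so the k-means split, the mean-label "experts" (standing in for the T5-Adapter experts), the softmax-over-distance weighting, and every function name below are assumptions chosen only to make the three stages concrete.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))

def kmeans(points, init_centroids, iters=10):
    """Divide (assumed): plain k-means from the given initial centroids."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def train_expert(samples):
    """Conquer (toy): an 'expert' that predicts the mean label of its subset."""
    mean_label = sum(label for _, label in samples) / len(samples)
    return lambda point: mean_label

def mixture_weights(point, centroids, temperature=1.0):
    """Combine (assumed): softmax over negative distances to each centroid."""
    exps = [math.exp(-math.dist(point, c) / temperature) for c in centroids]
    total = sum(exps)
    return [e / total for e in exps]

def predict(point, centroids, experts, temperature=1.0):
    """Weighted average of all expert outputs for one unseen sample."""
    weights = mixture_weights(point, centroids, temperature)
    return sum(w * expert(point) for w, expert in zip(weights, experts))

# Hypothetical "seen" data: (embedding, label) pairs forming two obvious groups.
seen = [((0.0, 0.0), 1.0), ((0.2, 0.0), 1.0),
        ((4.0, 4.0), 3.0), ((4.2, 4.0), 3.0)]

centroids = kmeans([x for x, _ in seen], [(0.0, 0.0), (4.0, 4.0)])

# Assign each seen sample to its subset and train one expert per subset.
subsets = [[] for _ in centroids]
for x, y in seen:
    subsets[nearest(x, centroids)].append((x, y))
experts = [train_expert(s) for s in subsets]

# An unseen sample is mapped to all experts, weighted by proximity.
unseen = (1.0, 0.5)
weights = mixture_weights(unseen, centroids)
prediction = predict(unseen, centroids, experts)
```

In this sketch the unseen sample lies closer to the first subset, so the first expert dominates the mixture while the second still contributes a small share. A soft combination of this kind is one plausible reading of "ensemble inference"; the paper's actual mapping and weighting may differ.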

Taxonomy

Topics: Speech and dialogue systems · Context-Aware Activity Recognition Systems · EEG and Brain-Computer Interfaces

Methods: Dynamic Sparse Training