Data Augmentation for Spoken Language Understanding via Joint Variational Generation
Kang Min Yoo, Youhyun Shin, Sang-goo Lee

TL;DR
This paper introduces a joint variational generation approach to synthesize fully annotated utterances, significantly improving spoken language understanding models by addressing data scarcity through high-quality synthetic data.
Contribution
It presents a novel generative architecture that jointly synthesizes annotated utterances using latent variable models, enhancing SLU performance across multiple datasets.
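As a rough illustration of what jointly synthesizing annotated utterances with a latent variable model could look like as a training objective, the following is a generic VAE-style lower bound, not the paper's exact formulation. The symbols are assumptions introduced here: w is the utterance (word sequence), s the slot labels, y the intent, z the latent variable, and the slot and intent labels are assumed conditionally independent given z and w.

```latex
% Sketch only: a generic evidence lower bound for a joint model over
% an utterance w, slot labels s, and intent y, with latent variable z.
% The factorization p(w, s, y | z) = p(w | z) p(s | z, w) p(y | z, w)
% is an assumption made for illustration.
\log p_\theta(w, s, y)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid w)}\!\left[
      \log p_\theta(w \mid z)
    + \log p_\theta(s \mid z, w)
    + \log p_\theta(y \mid z, w)
  \right]
  - \mathrm{KL}\!\left( q_\phi(z \mid w) \,\|\, p(z) \right)
```

Under this sketch, a synthetic annotated example would be produced by sampling z from the prior p(z), decoding an utterance w, and then predicting s and y from z and w.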
Findings
Synthetic data improves SLU model accuracy.
The approach outperforms existing data augmentation methods.
Statistical tests confirm the significance of the improvements.
Abstract
Data scarcity is one of the main obstacles to domain adaptation in spoken language understanding (SLU), owing to the high cost of creating manually tagged SLU datasets. Recent work on neural text generative models, particularly latent variable models such as the variational autoencoder (VAE), has shown promising results in generating plausible and natural sentences. In this paper, we propose a novel generative architecture that leverages the generative power of latent variable models to jointly synthesize fully annotated utterances. Our experiments show that existing SLU models trained on the additional synthetic examples achieve performance gains. Our approach not only helps alleviate the data scarcity issue in the SLU task for many datasets but also indiscriminately improves language understanding performance for various SLU models, supported by extensive experiments and…
