Pattern-aware Data Augmentation for Query Rewriting in Voice Assistant Systems
Yunmo Chen, Sixing Lu, Fan Yang, Xiaojiang Huang, Xing Fan, Chenlei Guo

TL;DR
This paper introduces a pattern-aware data augmentation framework for query rewriting in voice assistants, improving performance especially in low-resource settings by generating diverse rewrite candidates through a learned, controllable sequence-to-sequence approach.
Contribution
It presents a novel augmentation method that learns patterns from data and uses policy gradient optimization for controllable, diverse query rewriting generation.
Findings
Outperforms baseline QR models in experiments
Effective in low-resource domain scenarios
Generates diverse, pattern-aware rewrite candidates
Abstract
Query rewriting (QR) systems are widely used to reduce the friction caused by errors in a spoken language understanding pipeline. However, the underlying supervised models require a large number of labeled pairs, and these pairs are hard and costly to collect. Therefore, we propose an augmentation framework that learns patterns from existing training pairs and generates rewrite candidates from rewrite labels inversely to compensate for insufficient QR training data. The proposed framework casts the augmentation problem as a sequence-to-sequence generation task and enforces the optimization process with a policy gradient technique for controllable rewarding. This approach goes beyond the traditional heuristics or rule-based augmentation methods and is not constrained to generate predefined patterns of swapping/replacing words. Our experimental results show its effectiveness compared…
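To make the policy gradient idea in the abstract concrete, the following is a minimal REINFORCE sketch. It is not the authors' implementation. It swaps the paper's sequence-to-sequence generator for a toy categorical policy over a fixed list of candidate strings, and the names `toy_reward`, `reinforce_step`, and `train`, the overlap-based reward, and the example utterances are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(candidate, label):
    """Hypothetical reward (an assumption, not the paper's reward):
    token-level Jaccard overlap with the label, and zero for an exact
    copy, since a copy adds no new training signal."""
    if candidate == label:
        return 0.0
    c, l = set(candidate.split()), set(label.split())
    return len(c & l) / len(c | l)


def reinforce_step(logits, action, reward, baseline, lr=0.5):
    """One REINFORCE update for a softmax policy.

    The gradient of log pi(action) with respect to logit i is
    (1 if i == action else 0) - pi(i); it is scaled by the advantage.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        logit + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def train(candidates, label, steps=200, seed=0):
    """Sample candidates, score them, and nudge the policy toward
    higher-reward ones. Returns the final sampling probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(candidates)), weights=probs)[0]
        reward = toy_reward(candidates[action], label)
        logits = reinforce_step(logits, action, reward, baseline)
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


if __name__ == "__main__":
    label = "play songs by imagine dragons"
    candidates = [
        "play songs by imagine dragon",   # plausible noisy request
        "play songs by imagine dragons",  # exact copy of the label
        "turn off the lights",            # unrelated request
    ]
    for cand, p in zip(candidates, train(candidates, label)):
        print(f"{p:.3f}  {cand}")
```

Changing the reward function is what makes the generation controllable in this sketch: the update rule stays the same, while the reward decides which kinds of candidates the policy learns to prefer.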
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
