Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data
Massimo Nicosia, Zhongdi Qu, Yasemin Altun

TL;DR
This paper introduces the Translate-and-Fill method, a zero-shot data augmentation technique that enhances multilingual semantic parsing by generating synthetic training data using a sequence-to-sequence filler trained only on English.
Contribution
The paper presents Translate-and-Fill, a simplification of the Translate-Align-Project pipeline that improves multilingual semantic parsing without a separate alignment step, reaching accuracy competitive with alignment-based systems.
Findings
TaF achieves accuracy competitive with systems that rely on traditional alignment techniques.
The method effectively generalizes to multiple languages in zero-shot settings.
Data augmentation with TaF improves semantic parsing performance.
Abstract
While multilingual pretrained language models (LMs) fine-tuned on a single language have shown substantial cross-lingual task transfer capabilities, there is still a wide performance gap in semantic parsing tasks when target language supervision is available. In this paper, we propose a novel Translate-and-Fill (TaF) method to produce silver training data for a multilingual semantic parser. This method simplifies the popular Translate-Align-Project (TAP) pipeline and consists of a sequence-to-sequence filler model that constructs a full parse conditioned on an utterance and a view of the same parse. Our filler is trained on English data only but can accurately complete instances in other languages (i.e., translations of the English training utterances), in a zero-shot fashion. Experimental results on three multilingual semantic parsing datasets show that data augmentation with TaF…
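The filler setup described in the abstract can be sketched as plain data preparation. This is a minimal illustration, not the authors' implementation: the bracketed parse format, the `<sep>` separator, the idea that the "view" is the parse with its text tokens removed, and all function names are assumptions made for this example. The seq2seq model itself is omitted.

```python
# Hypothetical sketch of building filler inputs; formats and names are assumed.

def parse_view(parse: str) -> str:
    """Return a view of the parse that keeps only structural tokens.

    Assumes a bracketed parse where labels look like "[IN:..." or "[SL:..."
    and "]" closes a node; every other token is utterance text and is dropped.
    """
    return " ".join(t for t in parse.split() if t.startswith("[") or t == "]")


def filler_training_pair(utterance: str, parse: str) -> tuple[str, str]:
    """English training pair: (utterance + parse view) -> full parse."""
    return f"{utterance} <sep> {parse_view(parse)}", parse


def filler_augmentation_input(translated_utterance: str, english_parse: str) -> str:
    """Zero-shot input: a translated utterance paired with the English parse view.

    A filler trained on English pairs would be asked to complete this view
    with text from the translated utterance, yielding a silver parse.
    """
    return f"{translated_utterance} <sep> {parse_view(english_parse)}"


english_utterance = "set alarm for 7 am"
english_parse = "[IN:CREATE_ALARM set alarm [SL:DATE_TIME for 7 am ] ]"

source, target = filler_training_pair(english_utterance, english_parse)
zero_shot_source = filler_augmentation_input(
    "mets une alarme pour 7 h", english_parse
)
```

Under these assumptions, the same view string is shared between the English training input and the translated input, which is what lets a filler trained only on English be applied to translations without an alignment step.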
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
