Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing
Deokhyung Kang, Seonjeong Hwang, Yunsu Kim, Gary Geunbae Lee

TL;DR
This paper introduces Cross-Lingual Back-Parsing (CBP), a novel data augmentation method that synthesizes target language utterances from meaning representations to improve zero-resource cross-lingual semantic parsing.
Contribution
The paper proposes CBP, which uses the representation geometry of multilingual pretrained language models (mPLMs) to synthesize target-language utterances from source meaning representations, improving zero-shot cross-lingual transfer.
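The data flow behind this contribution can be illustrated with a minimal sketch. It shows only the augmentation loop: each labeled source example keeps its meaning representation and gains a synthesized utterance per target language. The names `Example`, `augment`, and `placeholder_back_parser` are hypothetical and do not come from the paper. The placeholder stands in for the actual utterance generator, whose mPLM-based design is not described in this summary.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Example:
    """One training pair for a semantic parser (hypothetical container)."""
    utterance: str
    meaning_representation: str
    language: str


# Hypothetical interface: (meaning_representation, target_language) -> utterance.
BackParser = Callable[[str, str], str]


def augment(
    source_data: List[Example],
    target_languages: List[str],
    back_parser: BackParser,
) -> List[Example]:
    """Return the source data plus one synthesized example per target language.

    The meaning representation is copied unchanged; only the utterance
    is generated in the target language.
    """
    augmented = list(source_data)
    for example in source_data:
        for language in target_languages:
            synthetic = back_parser(example.meaning_representation, language)
            augmented.append(
                Example(synthetic, example.meaning_representation, language)
            )
    return augmented


def placeholder_back_parser(meaning_representation: str, language: str) -> str:
    """Stand-in generator (assumption): tags the MR with the language code.

    A real back-parser would produce a natural target-language utterance.
    """
    return f"[{language}] {meaning_representation}"
```

A semantic parser would then be trained on the returned list, so that it sees target-language inputs paired with the original source labels.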
Findings
CBP significantly improves target-language performance on the Mschema2QA and Xspider semantic parsing benchmarks.
Synthesized utterances maintain high slot value alignment and semantic integrity.
The method is effective using only labeled source-language data and monolingual corpora.
Abstract
Recent efforts have aimed to utilize multilingual pretrained language models (mPLMs) to extend semantic parsing (SP) across multiple languages without requiring extensive annotations. However, achieving zero-shot cross-lingual transfer for SP remains challenging, leading to a performance gap between source and target languages. In this study, we propose Cross-Lingual Back-Parsing (CBP), a novel data augmentation methodology designed to enhance cross-lingual transfer for SP. Leveraging the representation geometry of the mPLMs, CBP synthesizes target language utterances from source meaning representations. Our methodology effectively performs cross-lingual data augmentation in challenging zero-resource settings, by utilizing only labeled data in the source language and monolingual corpora. Extensive experiments on two cross-language SP benchmarks (Mschema2QA and Xspider) demonstrate that…
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
