TL;DR
This paper explores methods for automatically generating AMR annotations to improve multilingual AMR-to-text generation, demonstrating that combining gold and silver data enhances performance across multiple languages.
Contribution
It investigates techniques for automatically generating AMR annotations from different sources and shows that combining gold and silver data outperforms prior approaches.
Findings
Models trained on gold AMR paired with silver (machine-translated) sentences outperform approaches that rely on generated silver AMR.
Combining gold and silver sources further improves multilingual AMR-to-text results.
The approach surpasses the previous state of the art for German, Italian, Spanish, and Chinese by a large margin.
Abstract
Recent work on multilingual AMR-to-text generation has exclusively focused on data augmentation strategies that utilize silver AMR. However, this assumes a high quality of generated AMRs, potentially limiting the transferability to the target task. In this paper, we investigate different techniques for automatically generating AMR annotations, where we aim to study which source of information yields better multilingual results. Our models trained on gold AMR with silver (machine translated) sentences outperform approaches which leverage generated silver AMR. We find that combining both complementary sources of information further improves multilingual AMR-to-text generation. Our models surpass the previous state of the art for German, Italian, Spanish, and Chinese by a large margin.
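The two data sources contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. The `translate` and `parse` callables, the `Example` record, and the use of parallel (English, target-language) sentence pairs for the silver-AMR side are all hypothetical and introduced only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Example:
    amr: str              # linearized AMR graph
    sentence: str         # target-language sentence
    lang: str             # target language code, e.g. "de"
    amr_source: str       # "gold" (human-annotated) or "silver" (generated)
    sentence_source: str  # "gold" (human-written) or "silver" (machine-translated)


def gold_amr_examples(
    gold_pairs: List[Tuple[str, str]],
    translate: Callable[[str, str], str],
    lang: str,
) -> List[Example]:
    """Gold AMR + silver sentence: keep the annotated AMR, machine-translate the text.

    `translate(sentence, lang)` is a hypothetical stand-in for an MT system.
    """
    return [
        Example(amr, translate(sentence, lang), lang, "gold", "silver")
        for amr, sentence in gold_pairs
    ]


def silver_amr_examples(
    parallel_pairs: List[Tuple[str, str]],
    parse: Callable[[str], str],
    lang: str,
) -> List[Example]:
    """Silver AMR + gold sentence: generate the AMR automatically, keep the human text.

    Assumption: each pair is (english_sentence, target_sentence) and
    `parse(english_sentence)` is a hypothetical stand-in for an AMR parser.
    """
    return [
        Example(parse(english), target, lang, "silver", "gold")
        for english, target in parallel_pairs
    ]


def combine(a: List[Example], b: List[Example], seed: int = 0) -> List[Example]:
    """Merge both sources into one shuffled training set."""
    mixed = list(a) + list(b)
    random.Random(seed).shuffle(mixed)
    return mixed
```

Keeping `amr_source` and `sentence_source` on every record makes it possible to train on either source alone or on the combined set, which mirrors the comparison the abstract describes.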
