Semantically-Informed Syntactic Machine Translation: A Tree-Grafting Approach
Kathryn Baker, Michael Bloodgood, Chris Callison-Burch, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Scott Miller, Christine Piatko

TL;DR
This paper introduces a semantically-informed syntactic framework for statistical machine translation. Enriching the syntactic tags on target-language training text with semantic information improves translation quality, yielding the highest scores yet reported on the NIST 2009 Urdu-English task and large gains for a low-resource language whose word order differs from English.
Contribution
It presents a novel tree-grafting approach that incorporates semantic information into syntactic structures for machine translation, improving performance over a linguistically naive Hiero baseline.
Findings
Significantly outperformed the Hiero baseline on the NIST 2009 Urdu-English task
Achieved the highest scores yet reported for that task
Demonstrated large gains for a low-resource language whose word order differs from English
Abstract
We describe a unified and coherent syntactic framework for supporting a semantically-informed syntactic approach to statistical machine translation. Semantically enriched syntactic tags assigned to the target-language training texts improved translation quality. The resulting system significantly outperformed a linguistically naive baseline model (Hiero), and reached the highest scores yet reported on the NIST 2009 Urdu-English translation task. This finding supports the hypothesis (posed by many researchers in the MT community, e.g., in DARPA GALE) that both syntactic and semantic information are critical for improving translation quality, and further demonstrates that large gains can be achieved for low-resource languages with different word order than English.
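To make the idea of "semantically enriched syntactic tags" concrete, here is a minimal Python sketch of one way a semantic tag could be grafted onto a target-language parse tree. It is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the `graft` and `to_bracketed` helpers, the `PERSON` tag, and the `NP-PERSON` label format are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A parse-tree node: leaves carry a word, internal nodes carry children."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None

    def leaves(self) -> List[str]:
        """Return the words covered by this node, left to right."""
        if self.word is not None:
            return [self.word]
        return [w for child in self.children for w in child.leaves()]


def graft(node: Node, span_words: List[str], tag: str) -> bool:
    """Append `tag` to the label of the highest node whose words equal span_words.

    Hypothetical helper (an assumption for illustration, not the paper's algorithm).
    Returns True if some node was relabeled.
    """
    if node.leaves() == span_words:
        node.label = f"{node.label}-{tag}"
        return True
    # Otherwise search the children, stopping at the first match.
    return any(graft(child, span_words, tag) for child in node.children)


def to_bracketed(node: Node) -> str:
    """Render the tree in Penn-Treebank-style bracket notation."""
    if node.word is not None:
        return f"({node.label} {node.word})"
    inner = " ".join(to_bracketed(child) for child in node.children)
    return f"({node.label} {inner})"


# Toy English parse; the semantic tagger output ("John" is a PERSON) is assumed.
tree = Node("S", [
    Node("NP", [Node("NNP", word="John")]),
    Node("VP", [Node("VBD", word="left")]),
])
graft(tree, ["John"], "PERSON")
print(to_bracketed(tree))
```

In this sketch the refined label (`NP-PERSON`) replaces the plain syntactic label on the target side, so any grammar rules later extracted from such trees would carry the semantic distinction. How the actual system selects nodes and which semantic tags it uses is not specified in the abstract.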
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
