A Model for Fine-Grained Alignment of Multilingual Texts
Lea Cyrus, Hendrik Feddes

TL;DR
This paper introduces an alignment model based on predicate-argument structures for fine-grained alignment of multilingual texts. The model copes with non-literal translations by coding them explicitly, and it covers the middle ground between sentence alignment and word alignment.
Contribution
It presents an alignment model that explicitly encodes non-literal translations using predicate-argument structures, so that multilingual texts can be aligned at a level between sentences and words.
Findings
The model copes with non-literal translations by coding them explicitly
The model is currently used in FuSe, a recently initiated parallel English-German treebank project
FuSe can in principle be extended with additional languages
Abstract
While alignment of texts on the sentential level is often seen as being too coarse, and word alignment as being too fine-grained, bi- or multilingual texts which are aligned on a level in-between are a useful resource for many purposes. Starting from a number of examples of non-literal translations, which tend to make alignment difficult, we describe an alignment model which copes with these cases by explicitly coding them. The model is based on predicate-argument structures and thus covers the middle ground between sentence and word alignment. The model is currently used in a recently initiated project of a parallel English-German treebank (FuSe), which can in principle be extended with additional languages.
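The abstract does not spell out the model's data structures, so the following is only a minimal sketch of what alignment at the level of predicate-argument structures might look like, not the authors' actual representation. The class names (`Predicate`, `ArgumentLink`, `PredicateAlignment`), the `ARG0`/`ARG1` role labels, the `role-shift` tag, and the example sentence pair are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Predicate:
    """A predicate and its arguments, keyed by a (hypothetical) role label."""
    lemma: str
    arguments: Dict[str, str] = field(default_factory=dict)


@dataclass
class ArgumentLink:
    """Links one source argument to one target argument.

    `tag` is a hypothetical label that marks a non-literal correspondence
    explicitly; None means the link is treated as a plain correspondence.
    """
    source_role: str
    target_role: str
    tag: Optional[str] = None


@dataclass
class PredicateAlignment:
    """Aligns two predicates and, below them, their arguments."""
    source: Predicate
    target: Predicate
    links: List[ArgumentLink] = field(default_factory=list)

    def link(self, source_role: str, target_role: str,
             tag: Optional[str] = None) -> None:
        # Refuse links to roles that the predicates do not actually have.
        if source_role not in self.source.arguments:
            raise KeyError(source_role)
        if target_role not in self.target.arguments:
            raise KeyError(target_role)
        self.links.append(ArgumentLink(source_role, target_role, tag))

    def unaligned_roles(self) -> Tuple[List[str], List[str]]:
        """Return the source and target roles that no link covers."""
        linked_source = {l.source_role for l in self.links}
        linked_target = {l.target_role for l in self.links}
        source_rest = [r for r in self.source.arguments if r not in linked_source]
        target_rest = [r for r in self.target.arguments if r not in linked_target]
        return source_rest, target_rest


# Illustrative pair (not taken from the paper):
# "I like the book" / "Das Buch gefällt mir"
english = Predicate("like", {"ARG0": "I", "ARG1": "the book"})
german = Predicate("gefallen", {"ARG0": "das Buch", "ARG1": "mir"})

alignment = PredicateAlignment(english, german)
alignment.link("ARG0", "ARG1", tag="role-shift")  # I <-> mir
alignment.link("ARG1", "ARG0", tag="role-shift")  # the book <-> das Buch
```

In this sketch, the unit of alignment is the predicate with its arguments rather than the whole sentence or the individual word, and a divergence between the two languages is recorded as a tag on the link instead of being left unaligned.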
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
