Characterizing the Effects of Translation on Intertextuality using Multilingual Embedding Spaces
Hope McGovern, Hale Sirin, Tom Lippincott

TL;DR
This paper explores how multilingual embedding spaces can quantify the preservation of intertextuality in translations, revealing differences between human and machine translation effects on literary features.
Contribution
The paper introduces a corpus-level metric for characterizing intertextuality and uses it to compare how well extant human translations and machine-generated translations of Biblical texts preserve this rhetorical device.
Findings
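Human translations can over- or underemphasize the intertextuality present in the text.
Machine translations provide a neutral baseline for comparison.
A corpus-level metric quantifies how well intertextuality is preserved across extant human translations and their machine-generated counterparts.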
Abstract
Rhetorical devices are difficult to translate, but they are crucial to the translation of literary documents. We investigate the use of multilingual embedding spaces to characterize the preservation of intertextuality, one common rhetorical device, across human and machine translation. To do so, we use Biblical texts, which are both full of intertextual references and are highly translated works. We provide a metric to characterize intertextuality at the corpus level and provide a quantitative analysis of the preservation of this rhetorical device across extant human translations and machine-generated counterparts. We go on to provide qualitative analysis of cases wherein human translations over- or underemphasize the intertextuality present in the text, whereas machine translations provide a neutral baseline. This provides support for established scholarship proposing that human…
Taxonomy
Topics: Natural Language Processing Techniques · Subtitles and Audiovisual Media · Translation Studies and Practices
