Automatic Detection of Omissions in Translations
I. Dan Melamed (University of Pennsylvania)

TL;DR
ADOMIT is an algorithm that automatically detects omissions in translations using only geometric analysis of bitext maps, with no linguistic information, which makes it usable as a quality control tool for translators and translation bureaus.
Contribution
It introduces a novel, language-independent method for omission detection in translations using only geometric analysis of bitext maps.
Findings
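ADOMIT discovered many errors in a hand-constructed gold standard for evaluating bitext mapping algorithms.
Quantitative evaluation on simulated omissions showed that it detects omissions usefully even with the limited accuracy of current bitext mapping technology.
Because it uses no linguistic information, it handles omissions that do not correspond to linguistic units, such as those caused by word-processing mishaps.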
Abstract
ADOMIT is an algorithm for Automatic Detection of OMIssions in Translations. The algorithm relies solely on geometric analysis of bitext maps and uses no linguistic information. This property allows it to deal equally well with omissions that do not correspond to linguistic units, such as might result from word-processing mishaps. ADOMIT has proven itself by discovering many errors in a hand-constructed gold standard for evaluating bitext mapping algorithms. Quantitative evaluation on simulated omissions showed that, even with today's poor bitext mapping technology, ADOMIT is a valuable quality control tool for translators and translation bureaus.
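To make the geometric idea concrete, here is a minimal sketch of how an omission might show up in a bitext map. It is not the ADOMIT algorithm itself, whose details are not given in this summary. It assumes the bitext map is a list of (source offset, target offset) points and that an omission appears as a long stretch of source text matched by very little target text, that is, a segment with an unusually low slope. The function name `find_flat_segments` and the thresholds `max_slope` and `min_source_span` are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# A bitext map point: (offset in source text, offset in target text).
Point = Tuple[float, float]


def find_flat_segments(
    bitext_map: List[Point],
    max_slope: float = 0.2,         # hypothetical threshold, not from the paper
    min_source_span: float = 50.0,  # hypothetical threshold, not from the paper
) -> List[Tuple[float, float]]:
    """Return source-side intervals whose map segment is suspiciously flat.

    A segment between two consecutive map points is flagged when it covers
    at least `min_source_span` units of source text while its slope
    (target span divided by source span) is at most `max_slope`.
    """
    points = sorted(bitext_map)
    suspects = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx = x1 - x0
        dy = y1 - y0
        if dx < min_source_span:
            continue  # too short to count as a likely omission
        if dy / dx <= max_slope:
            suspects.append((x0, x1))
    return suspects


# Toy map: source offsets 100..300 correspond to only 10 units of target text.
example_map = [(0, 0), (100, 100), (300, 110), (400, 210)]
print(find_flat_segments(example_map))  # [(100, 300)]
```

A naive slope test like this is sensitive to noise in the map: a single misplaced point can create or hide a flat segment. A practical detector would need to tolerate such noise.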
Taxonomy
Topics: Natural Language Processing Techniques
