Edit Distances and Their Applications to Downstream Tasks in Research and Commercial Contexts
Félix do Carmo, Diptesh Kanojia

TL;DR
This paper reviews various edit distance metrics, analyzing their components, limitations, and applications in research and commercial translation tasks, highlighting their impact on evaluating post-editing effort and translation quality.
Contribution
It dissects the components of edit distances, evaluates their effectiveness in real-world translation contexts, and discusses implications for commercial and research applications.
Findings
Edit distances are sensitive to their component choices and implementation.
Imperfect edit distances may misrepresent actual post-editing effort.
Commercial translation tools integrate edit distances, influencing translator rates.
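To make the first finding concrete, the following is a minimal sketch, not the authors' code, of how one component choice (the unit being compared) changes the score for the same pair of sentences. The example sentences, the function name `levenshtein`, and the decision to normalize by the length of the post-edited text are all assumptions made for illustration.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and replacements
    needed to turn `source` into `target` (any two sequences)."""
    previous = list(range(len(target) + 1))
    for i, s_item in enumerate(source, start=1):
        current = [i]
        for j, t_item in enumerate(target, start=1):
            cost = 0 if s_item == t_item else 1
            current.append(min(
                previous[j] + 1,         # delete s_item
                current[j - 1] + 1,      # insert t_item
                previous[j - 1] + cost,  # replace, or keep if equal
            ))
        previous = current
    return previous[-1]


# Hypothetical MT output and its post-edited version.
mt_output = "the cat sat on mat"
post_edit = "the cats sat on the mat"

# Same texts, two different units of comparison.
word_edits = levenshtein(mt_output.split(), post_edit.split())
char_edits = levenshtein(mt_output, post_edit)

# Normalizing by post-edit length is an illustrative assumption.
word_rate = word_edits / len(post_edit.split())
char_rate = char_edits / len(post_edit)

print(f"word level: {word_edits} edits, rate {word_rate:.2f}")
print(f"char level: {char_edits} edits, rate {char_rate:.2f}")
```

At word level, changing "cat" to "cats" costs a full replacement; at character level it costs a single inserted letter, so the two rates diverge even though the editing work is identical.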
Abstract
The tutorial describes the concept of edit distances applied to research and commercial contexts. We use Translation Edit Rate (TER), Levenshtein, Damerau-Levenshtein, Longest Common Subsequence and n-gram distances to demonstrate the frailty of statistical metrics when comparing text sequences. Our discussion disassembles them into their essential components. We discuss the centrality of four editing actions: insert, delete, replace and move words, and show their implementations in openly available packages and toolkits. The application of edit distances in downstream tasks often assumes that these accurately represent work done by post-editors and real errors that need to be corrected in MT output. We discuss how imperfect edit distances are in capturing the details of this error correction work and the implications for researchers and for commercial applications, of these uses of…
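As a rough illustration of why the set of editing actions matters, the sketch below computes a distance based on the Longest Common Subsequence, which allows only insert and delete. It is an assumption-laden example rather than any toolkit's implementation: the function names `lcs_length` and `lcs_distance` and the sentences are hypothetical.

```python
def lcs_length(source, target):
    """Length of the longest common subsequence of two sequences."""
    previous = [0] * (len(target) + 1)
    for s_item in source:
        current = [0]
        for j, t_item in enumerate(target, start=1):
            if s_item == t_item:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def lcs_distance(source, target):
    """Edit distance when only insert and delete are available:
    every item outside the common subsequence is deleted or inserted."""
    return len(source) + len(target) - 2 * lcs_length(source, target)


# A post-editor moves one word to the end of the sentence.
moved_src = "yesterday she read the report".split()
moved_tgt = "she read the report yesterday".split()

# A post-editor replaces one word.
replaced_src = "the cat sat".split()
replaced_tgt = "the dog sat".split()

print(lcs_distance(moved_src, moved_tgt))        # one move counted as delete + insert
print(lcs_distance(replaced_src, replaced_tgt))  # one replacement counted as delete + insert
```

Both cases score 2 here, although each corresponds to a single action by the post-editor. A metric with a replace action would count the second case as 1, and a metric with a move action would count the first case as 1, which is one way the choice of components shapes the reported effort.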
Taxonomy
Topics: Innovative Teaching and Learning Methods · Usability and User Interface Design
