Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting
Mikel L. Forcada, Carolina Scarton, Lucia Specia, Barry Haddow, Alexandra Birch

TL;DR
This study compares gap-filling and reading comprehension questionnaires for evaluating machine translation in gisting, finding gap-filling to be a cost-effective alternative with comparable system rankings.
Contribution
The paper provides the first systematic comparison of gap-filling (GF) and reading comprehension questionnaire (RCQ) methods for MT evaluation, including an analysis of the variables affecting gap-filling effectiveness.
Findings
Both methods identify MT usefulness.
RCQ and GF rankings largely agree.
GF scores vary widely among informants.
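One way to quantify how far two system rankings agree is a rank correlation such as Kendall's tau. The sketch below is only an illustration: the source does not say which agreement measure the paper uses, and the system names and scores are hypothetical placeholders, not results from the paper.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length score lists (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both methods order this pair the same way
        elif sign < 0:
            discordant += 1   # the methods disagree on this pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-system scores (NOT taken from the paper).
systems = ["sysA", "sysB", "sysC", "sysD"]
rcq_scores = [0.70, 0.62, 0.55, 0.40]  # fraction of RCQ questions answered correctly
gf_scores = [0.48, 0.51, 0.39, 0.30]   # fraction of gaps filled correctly

tau = kendall_tau(rcq_scores, gf_scores)
print(f"Kendall tau between RCQ and GF rankings: {tau:.3f}")
```

A value of 1 means the two methods rank every pair of systems identically, and a value near 0 means the rankings are unrelated.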
Abstract
A popular application of machine translation (MT) is gisting: MT is consumed as is to make sense of text in a foreign language. Evaluation of the usefulness of MT for gisting is surprisingly uncommon. The classical method uses reading comprehension questionnaires (RCQ), in which informants are asked to answer professionally-written questions in their language about a foreign text that has been machine-translated into their language. Recently, gap-filling (GF), a form of cloze testing, has been proposed as a cheaper alternative to RCQ. In GF, certain words are removed from reference translations and readers are asked to fill the gaps left using the machine-translated text as a hint. This paper reports, for the first time, a comparative evaluation, using both RCQ and GF, of translations from multiple MT systems for the same foreign texts, and a systematic study on the effect of variables…
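The GF procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the random gap selection, the `gap_rate` parameter, the exact-match scoring, and the function names `make_gaps` and `score` are all assumptions made for the example.

```python
import random
import string


def make_gaps(reference, gap_rate=0.2, seed=0):
    """Replace a random subset of words in a reference translation with numbered gaps.

    Random selection and gap_rate are illustrative assumptions; the source
    does not specify how gapped words are chosen.
    """
    rng = random.Random(seed)
    tokens = reference.split()
    n_gaps = max(1, round(len(tokens) * gap_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_gaps))
    answers = {}
    gapped = list(tokens)
    for gap_id, pos in enumerate(positions, start=1):
        answers[gap_id] = tokens[pos]
        gapped[pos] = f"__{gap_id}__"
    return " ".join(gapped), answers


def normalize(word):
    """Lowercase and strip surrounding punctuation before comparing."""
    return word.strip(string.punctuation).lower()


def score(responses, answers):
    """Fraction of gaps filled with exactly the removed word (assumed metric)."""
    correct = sum(
        1 for gap_id, word in answers.items()
        if normalize(responses.get(gap_id, "")) == normalize(word)
    )
    return correct / len(answers)


reference = "The minister said the new law will reduce taxes next year"
gapped_text, answers = make_gaps(reference)
print(gapped_text)  # shown to the informant alongside the MT output as a hint
print(score(answers, answers))  # an informant who fills every gap correctly
```

In an actual study the informant would see the gapped reference together with the machine-translated text, and the score would be computed from their typed responses.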
Taxonomy
Topics: Text Readability and Simplification · Software Engineering Research · Topic Modeling
