Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral

António Farinhas; Nuno M. Guerreiro; Sweta Agrawal; Ricardo Rei; André F.T. Martins

arXiv:2502.12701·cs.CL·February 19, 2025



TL;DR

This paper introduces a cascaded machine translation system that uses quality estimation (QE) metrics to decide which instances to defer from a smaller model to a larger one, matching the larger model's quality at reduced computational cost.

Contribution

The paper proposes a simple QE-based deferral rule for cascaded translation systems, improving efficiency while maintaining translation quality.

Findings

01

QE-based deferral matches larger model performance

02

Invokes the larger model for only 30% to 50% of examples

03

Validated through automatic and human evaluations

Abstract

Larger models often outperform smaller ones but come with high computational costs. Cascading offers a potential solution. By default, it uses smaller models and defers only some instances to larger, more powerful models. However, designing effective deferral rules remains a challenge. In this paper, we propose a simple yet effective approach for machine translation, using existing quality estimation (QE) metrics as deferral rules. We show that QE-based deferral allows a cascaded system to match the performance of a larger model while invoking it for a small fraction (30% to 50%) of the examples, significantly reducing computational costs. We validate this approach through both automatic and human evaluation.
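The deferral idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cascade_translate`, the callables `small_translate`, `large_translate`, and `qe_score`, and the `defer_fraction` budget are all hypothetical names introduced here. The sketch assumes that a higher QE score means a better translation and that the deferral rule sends the lowest-scoring share of small-model outputs to the larger model.

```python
from typing import Callable, List, Sequence, Tuple


def cascade_translate(
    sources: Sequence[str],
    small_translate: Callable[[str], str],   # hypothetical: cheap model
    large_translate: Callable[[str], str],   # hypothetical: expensive model
    qe_score: Callable[[str, str], float],   # hypothetical: (source, hypothesis) -> quality
    defer_fraction: float = 0.3,             # assumed budget: share sent to the large model
) -> Tuple[List[str], List[int]]:
    """Translate with the small model, then defer the lowest-QE outputs.

    Returns the final translations and the sorted indices that were deferred.
    """
    if not 0.0 <= defer_fraction <= 1.0:
        raise ValueError("defer_fraction must be between 0 and 1")

    # Step 1: every source is first translated by the small model.
    drafts = [small_translate(src) for src in sources]

    # Step 2: score each draft without a reference (assumes higher is better).
    scores = [qe_score(src, hyp) for src, hyp in zip(sources, drafts)]

    # Step 3: pick the lowest-scoring drafts, up to the deferral budget.
    n_defer = round(defer_fraction * len(sources))
    ranked = sorted(range(len(sources)), key=lambda i: scores[i])
    deferred = sorted(ranked[:n_defer])

    # Step 4: only the deferred sources are re-translated by the large model.
    outputs = list(drafts)
    for i in deferred:
        outputs[i] = large_translate(sources[i])
    return outputs, deferred
```

With `defer_fraction` set between 0.3 and 0.5, the large model is called on 30% to 50% of the inputs, which is the range the abstract reports. A fixed score threshold could be used in place of a budget; the budget form is chosen here only because it makes the share of deferred examples explicit.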


Taxonomy

Topics: Natural Language Processing Techniques