Removing Biases from Trainable MT Metrics by Using Self-Training

Miloš Stanojević

arXiv:1508.02445 · cs.CL · August 12, 2015 · 1 citation


TL;DR

This paper introduces a self-training method that reduces biases in trainable machine translation (MT) metrics. The method enables better domain adaptation without manual weight tuning, leading to more accurate translation quality assessments.

Contribution

It proposes a general self-training approach that mitigates biases in MT metrics without requiring feature-specific knowledge or manual weight adjustments.

Findings

01 Reduces length bias in MT metrics
02 Improves correlation with human judgments
03 Enables domain adaptation without manual tuning

Abstract

Most trainable machine translation (MT) metrics train their weights on human judgments of the outputs of state-of-the-art MT systems. This makes trainable metrics biased in many ways. One such bias is a preference for longer translations. When used for tuning, these biased metrics evaluate a different type of translation: n-best lists of translations with very diverse quality. Systems tuned with these metrics tend to produce overly long translations that are preferred by the metric but not by humans. This is usually solved by manually tweaking the metric's weights to value recall and precision equally. Our solution is more general: (1) it addresses not only the recall bias but also all other biases that might be present in the data, and (2) it does not require any knowledge of the types of features used, which is useful in cases where manual tuning of the metric's weights is not possible. This is…
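The length bias described in the abstract can be illustrated with a toy example. The sketch below is not the paper's metric or its self-training procedure; it assumes a simple unigram precision/recall score combined by a weighted harmonic mean, and the function names (`unigram_precision_recall`, `weighted_f`) and the example sentences are hypothetical. It shows how weighting recall heavily can make a padded, overly long hypothesis outscore a concise one, while equal weighting reverses the ranking.

```python
from collections import Counter


def unigram_precision_recall(hypothesis, reference):
    """Clipped unigram precision and recall of a hypothesis against one reference."""
    hyp_tokens = hypothesis.split()
    ref_tokens = reference.split()
    # Multiset intersection clips repeated words to their count in the reference.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(hyp_tokens) if hyp_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    return precision, recall


def weighted_f(precision, recall, recall_weight):
    """Weighted harmonic mean; recall_weight in [0, 1] is the share given to recall."""
    if precision == 0.0 or recall == 0.0:
        return 0.0
    return 1.0 / (recall_weight / recall + (1.0 - recall_weight) / precision)


# Hypothetical example sentences (not taken from the paper).
reference = "the cat is sleeping now"
concise_hyp = "the cat is sleeping"
long_hyp = "the cat is sleeping now right here on the mat"

concise_pr = unigram_precision_recall(concise_hyp, reference)
long_pr = unigram_precision_recall(long_hyp, reference)

# Equal weighting of precision and recall.
balanced_concise = weighted_f(*concise_pr, recall_weight=0.5)
balanced_long = weighted_f(*long_pr, recall_weight=0.5)

# Recall-heavy weighting, standing in for a metric with a recall bias.
biased_concise = weighted_f(*concise_pr, recall_weight=0.9)
biased_long = weighted_f(*long_pr, recall_weight=0.9)

print("balanced:", balanced_concise, balanced_long)
print("recall-heavy:", biased_concise, biased_long)
```

With equal weights the concise hypothesis scores higher; with the recall-heavy weight the long hypothesis scores higher, which is the kind of preference a tuned system could exploit by producing longer output.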


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research