Discourse Structure in Machine Translation Evaluation

Shafiq Joty; Francisco Guzmán; Lluís Màrquez; Preslav Nakov

arXiv:1710.01504·cs.CL·October 5, 2017

TL;DR

This paper investigates the use of sentence-level discourse structures, specifically RST parse trees, to enhance machine translation evaluation metrics, showing that discourse information improves correlation with human judgments.

Contribution

It introduces discourse-aware similarity measures based on RST trees and demonstrates their effectiveness in improving translation evaluation metrics.

Findings

01

Discourse information complements existing metrics.

02

Similarity of RST trees correlates with translation quality.

03

All aspects of RST trees are relevant for evaluation.

Abstract

In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics regarding correlation with human judgments both at the segment- and at the system-level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTKparty. We also provide a detailed analysis of the relevance of various discourse elements and relations from the…
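The tree comparison mentioned in the abstract can be illustrated with a minimal sketch of a subtree-counting kernel in the style of Collins and Duffy's convolution tree kernel. This is an assumption about the general technique, not the paper's exact formulation. The nested-tuple tree encoding, the example trees, and all names (`nodes`, `production`, `common_fragments`, `subtree_kernel`, `normalized_kernel`, `combined_score`, `decay`, `weight`) are hypothetical and chosen only for illustration.

```python
from itertools import product

# A tree is a nested tuple: (label, child, child, ...); a leaf is a plain string.
# These two toy trees are illustrative, not real RST parser output.
ref_tree = ("Elaboration", ("Nucleus", "a"), ("Satellite", "b"))
hyp_tree = ("Elaboration", ("Nucleus", "a"), ("Satellite", "c"))


def nodes(tree):
    """Yield every internal node (tuple) of a nested-tuple tree."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)


def production(node):
    """Node label plus the labels of its immediate children (leaves are tagged)."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("leaf", c) for c in node[1:]
    )


def common_fragments(n1, n2, decay=1.0):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple) and isinstance(c2, tuple):
            # Each child is either left unexpanded (the 1.0) or expanded further.
            score *= 1.0 + common_fragments(c1, c2, decay)
    return score


def subtree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(common_fragments(a, b, decay) for a, b in product(nodes(t1), nodes(t2)))


def normalized_kernel(t1, t2, decay=1.0):
    """Scale the kernel to [0, 1] so trees of different sizes are comparable."""
    denom = (subtree_kernel(t1, t1, decay) * subtree_kernel(t2, t2, decay)) ** 0.5
    return subtree_kernel(t1, t2, decay) / denom if denom else 0.0


def combined_score(base_metric, discourse_sim, weight=0.5):
    """Linear combination of an existing metric score with the discourse similarity.
    The weight here is a hypothetical placeholder, not a value from the paper."""
    return base_metric + weight * discourse_sim


print(subtree_kernel(ref_tree, hyp_tree))     # 3.0
print(normalized_kernel(ref_tree, hyp_tree))  # 0.5
```

In this toy case the two trees share the root fragment, the root fragment with the nucleus expanded, and the nucleus fragment itself, giving a raw kernel of 3 against a self-similarity of 6 for each tree. Normalizing by self-similarity is one common design choice for keeping larger trees from dominating; whether the paper normalizes this way is not stated in the excerpt above.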

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research