Identifying Semantic Divergences in Parallel Text without Annotations

Yogarshi Vyas; Xing Niu; Marine Carpuat

arXiv:1803.11112·cs.CL·March 30, 2018

TL;DR

This paper introduces a deep neural model of bilingual semantic similarity that detects meaning divergences in parallel sentence pairs without manual annotation, and shows that these divergences matter for neural machine translation.

Contribution

The paper presents a deep neural approach for identifying semantic divergences in parallel sentences that requires no annotated data and outperforms models based on surface features derived from word alignments.

Findings

01

The neural model detects divergences more accurately than surface feature models.

02

The detected divergences matter for neural machine translation.

03

The model can be trained on any parallel corpus without manual annotation.

Abstract

Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in parallel sentence pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation. We show that our semantic model detects divergences more accurately than models based on surface features derived from word alignments, and that these divergences matter for neural machine translation.

Code & Models

Repositories

yogarshi/SemDiverge
PyTorch · Official
