Neural Networks Classifier for Data Selection in Statistical Machine Translation

Álvaro Peris; Mara Chinea-Rios; Francisco Casacuberta

arXiv:1612.05555 · cs.CL · December 22, 2016

TL;DR

This paper treats data selection for statistical machine translation as a classification task solved with a neural network classifier, and reports better translation quality than the cross-entropy selection method across several language pairs.

Contribution

The paper presents a novel neural network-based data selection method for SMT, outperforming the state-of-the-art cross-entropy approach.

Findings

1. Better translation quality than the cross-entropy method
2. Consistent results across different language pairs
3. Empirical validation of the neural classifier's effectiveness

Abstract

We address the data selection problem in statistical machine translation (SMT) as a classification task. The new data selection method is based on a neural network classifier. We describe the method and present empirical results showing that it provides better translation quality than a state-of-the-art method (i.e., cross-entropy). Moreover, the reported empirical results are consistent across different language pairs.
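
As a rough illustration of what "data selection as a classification task" can look like, the sketch below trains a binary classifier to separate in-domain from out-of-domain sentences and then keeps the highest-scoring sentences from a pool. Everything here is an assumption made for illustration: the function names (`featurize`, `predict`, `train_classifier`, `select`), the toy sentences, and above all the model, which is a bag-of-words logistic regression standing in for the neural network classifier the abstract refers to. It is not the authors' architecture or their training setup.

```python
import math
from collections import Counter


def featurize(sentence):
    # Bag-of-words counts over lowercased, whitespace-separated tokens.
    return Counter(sentence.lower().split())


def predict(weights, bias, feats):
    # Probability that a sentence is in-domain under a logistic model.
    z = bias + sum(weights.get(tok, 0.0) * cnt for tok, cnt in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(in_domain, out_domain, epochs=50, lr=0.1):
    # Stand-in classifier (assumption): logistic regression trained with SGD.
    # Label 1.0 marks in-domain sentences, 0.0 marks out-of-domain ones.
    data = [(featurize(s), 1.0) for s in in_domain]
    data += [(featurize(s), 0.0) for s in out_domain]
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label in data:
            err = label - predict(weights, bias, feats)
            bias += lr * err
            for tok, cnt in feats.items():
                weights[tok] = weights.get(tok, 0.0) + lr * err * cnt
    return weights, bias


def select(pool, weights, bias, k):
    # Rank pool sentences by in-domain probability and keep the top k.
    ranked = sorted(
        pool,
        key=lambda s: predict(weights, bias, featurize(s)),
        reverse=True,
    )
    return ranked[:k]


# Toy data, invented for this example.
in_domain = [
    "the patient received a dose of the drug",
    "adverse reactions to the treatment were rare",
]
out_domain = [
    "the parliament voted on the budget",
    "the committee adopted the report",
]
pool = [
    "the drug dose was reduced for the patient",
    "the budget report was adopted by parliament",
]

weights, bias = train_classifier(in_domain, out_domain)
selected = select(pool, weights, bias, k=1)
print(selected)
```

In an SMT setting the selected sentences would be bilingual pairs added to the training corpus; this sketch scores only one side of each pair to keep the example short.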

Code & Models

Repositories

lvapeab/sentence-selectioNN (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Data Mining Algorithms and Applications