Neural Networks Classifier for Data Selection in Statistical Machine Translation
Álvaro Peris, Mara Chinea-Rios, Francisco Casacuberta

TL;DR
This paper introduces a neural network classifier for data selection in statistical machine translation and reports better translation quality than the cross-entropy method across several language pairs.
Contribution
The paper presents a novel neural network-based data selection method for SMT, outperforming the state-of-the-art cross-entropy approach.
Findings
Better translation quality than the cross-entropy method
Consistent results across different language pairs
Empirical validation of the neural classifier's effectiveness
Abstract
We address the data selection problem in statistical machine translation (SMT) as a classification task. The new data selection method is based on a neural network classifier. We describe the method and present empirical results showing that it provides better translation quality than a state-of-the-art method (i.e., cross-entropy). Moreover, the reported results are consistent across different language pairs.
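To make the idea of "data selection as classification" concrete, here is a minimal sketch under stated assumptions. The abstract does not describe the network architecture, so a bag-of-words logistic regression stands in for the neural classifier; all function names (`featurize`, `train_classifier`, `in_domain_probability`, `select`) and the toy sentences are hypothetical and not taken from the paper. The sketch trains a binary classifier to separate in-domain from out-of-domain sentences, then keeps the pool sentences the classifier scores as most in-domain.

```python
import math
from collections import Counter


def featurize(sentence):
    """Bag-of-words counts for a lowercased, whitespace-tokenized sentence."""
    return Counter(sentence.lower().split())


def _logit(weights, bias, feats):
    """Linear score, clamped so math.exp cannot overflow."""
    z = bias + sum(weights.get(w, 0.0) * c for w, c in feats.items())
    return max(-30.0, min(30.0, z))


def train_classifier(in_domain, out_domain, epochs=50, lr=0.5):
    """Fit a logistic-regression stand-in (label 1 = in-domain, 0 = out-of-domain).

    Assumption: the paper uses a neural network here; this is a simpler model
    used only to illustrate the selection pipeline.
    """
    data = [(featurize(s), 1.0) for s in in_domain]
    data += [(featurize(s), 0.0) for s in out_domain]
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label in data:
            p = 1.0 / (1.0 + math.exp(-_logit(weights, bias, feats)))
            grad = p - label  # gradient of the log loss w.r.t. the logit
            bias -= lr * grad
            for w, c in feats.items():
                weights[w] = weights.get(w, 0.0) - lr * grad * c
    return weights, bias


def in_domain_probability(model, sentence):
    """Classifier's estimate that a sentence belongs to the in-domain class."""
    weights, bias = model
    return 1.0 / (1.0 + math.exp(-_logit(weights, bias, featurize(sentence))))


def select(model, pool, k):
    """Keep the k pool sentences with the highest in-domain probability."""
    ranked = sorted(pool, key=lambda s: in_domain_probability(model, s), reverse=True)
    return ranked[:k]
```

In an SMT setting, the selected sentences (with their translations) would be added to the training corpus. The number of sentences to keep, `k`, is a free parameter in this sketch.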
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Data Mining Algorithms and Applications
