Automatic Post-Editing for Vietnamese
Thanh Vu, Dai Quoc Nguyen

TL;DR
This paper introduces the first large-scale Vietnamese automatic post-editing (APE) dataset, with 5M translated and corrected sentence pairs, and shows through automatic and human evaluations that neural MT models handle the Vietnamese APE task effectively.
Contribution
It presents the first large-scale Vietnamese APE dataset and applies strong neural MT models, trained on that dataset, to the APE task.
Findings
Neural MT models are effective at correcting raw Vietnamese translated text.
The dataset contains 5M Vietnamese translated and corrected sentence pairs.
Both automatic and human evaluations support the effectiveness of the neural MT models on this task.
Abstract
Automatic post-editing (APE) is an important remedy for reducing errors of raw translated texts that are produced by machine translation (MT) systems or software-aided translation. In this paper, we present a systematic approach to tackle the APE task for Vietnamese. Specifically, we construct the first large-scale dataset of 5M Vietnamese translated and corrected sentence pairs. We then apply strong neural MT models to handle the APE task, using our constructed dataset. Experimental results from both automatic and human evaluations show the effectiveness of the neural MT models in handling the Vietnamese APE task.
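One way to read the setup in the abstract is as monolingual translation: the raw translated sentence is the source and the corrected sentence is the target. The snippet below is a minimal sketch of preparing such pairs for a generic sequence-to-sequence toolkit. The helper names (split_pairs, to_parallel), the validation ratio, and the placeholder sentences are assumptions made for illustration; they are not taken from the paper or its dataset.

```python
import random
from typing import List, Tuple

# (raw translated sentence, corrected sentence); hypothetical in-memory format
Pair = Tuple[str, str]


def split_pairs(pairs: List[Pair], valid_ratio: float = 0.1,
                seed: int = 0) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle a copy of the pairs and split it into train and validation sets."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_ratio))
    return shuffled[n_valid:], shuffled[:n_valid]


def to_parallel(pairs: List[Pair]) -> Tuple[List[str], List[str]]:
    """Return source-side and target-side lines, aligned by index."""
    src = [raw.strip() for raw, _ in pairs]
    tgt = [fixed.strip() for _, fixed in pairs]
    return src, tgt


# Placeholder strings stand in for real sentences (assumption, not dataset content).
pairs = [
    ("raw translation 1", "corrected translation 1"),
    ("raw translation 2", "corrected translation 2"),
    ("raw translation 3", "corrected translation 3"),
    ("raw translation 4", "corrected translation 4"),
]

train, valid = split_pairs(pairs, valid_ratio=0.25)
train_src, train_tgt = to_parallel(train)
```

The two aligned lists can then be written out one sentence per line, which is the parallel-file layout most neural MT toolkits accept.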
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
