How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus
Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim

TL;DR
This paper introduces a tool that leverages neural machine translation to efficiently build high-quality parallel corpora, reducing human effort and enhancing translation data quality.
Contribution
The paper presents a novel data-centric tool that combines NMT and human translation to improve efficiency in creating parallel corpora.
Findings
Reduces human labor in corpus construction
Improves data quality through combined NMT and human input
Tool is publicly available for broader use
Abstract
This paper proposes a tool for efficiently constructing high-quality parallel corpora while minimizing human labor, and makes the tool publicly available. Our proposed construction process is based on neural machine translation (NMT), so that NMT not only coexists with human translation but also improves its efficiency by combining data quality control with human translation in a data-centric approach.
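The abstract does not spell out the quality-control steps, so the following is only a minimal sketch of the general idea of combining NMT drafts with quality checks and human translation. Everything in it is an assumption made for illustration and is not taken from the paper or its tool: the `Pair` class, the `needs_human_review` and `route` functions, the length-ratio thresholds, and the sample sentences. Drafts that pass the simple checks are accepted as they are, and the rest go to a queue for human translators.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    source: str  # source-language sentence
    draft: str   # machine-translated draft (hypothetical field name)


def needs_human_review(pair, min_ratio=0.5, max_ratio=2.0):
    """Flag a draft using simple illustrative heuristics.

    These checks (empty side, untranslated copy, word-count ratio) are
    assumptions for this sketch, not the checks used by the paper's tool.
    """
    src_len = len(pair.source.split())
    tgt_len = len(pair.draft.split())
    if src_len == 0 or tgt_len == 0:
        return True  # one side is empty
    if pair.source.strip() == pair.draft.strip():
        return True  # draft is an untranslated copy of the source
    ratio = tgt_len / src_len
    return not (min_ratio <= ratio <= max_ratio)


def route(pairs):
    """Split pairs into accepted drafts and a human review queue."""
    accepted, review_queue = [], []
    for pair in pairs:
        if needs_human_review(pair):
            review_queue.append(pair)
        else:
            accepted.append(pair)
    return accepted, review_queue


# Illustrative sample data, not from the paper.
samples = [
    Pair("Good morning everyone", "Guten Morgen zusammen"),
    Pair("Hello", "Hello"),
    Pair("This is a long sentence with many words", "Kurz"),
]
accepted, review_queue = route(samples)
print(len(accepted), "accepted;", len(review_queue), "sent to human review")
```

In a setup like this, human effort is spent only on the drafts that fail the automatic checks, which is one way NMT and human translation could share the work.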
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
