Design of intelligent proofreading system for English translation based on CNN and BERT
Feijun Liu, Huifeng Wang, Kun Wang, Yizhen Wang

TL;DR
This paper introduces a hybrid CNN-BERT model for English translation proofreading that detects and corrects translation errors, reporting 90% error-detection accuracy and an 89.37% proofreading F1 score, and outperforming recent techniques by over 10%.
Contribution
A novel end-to-end hybrid CNN and BERT-based system for robust translation proofreading with integrated detection and correction modules.
Findings
Achieved 90% accuracy in error detection.
Attained 89.37% F1 score in proofreading.
Outperformed recent techniques by over 10%.
Abstract
Because automatic translations can contain errors that require substantial human post-editing, machine translation proofreading is essential for improving quality. This paper proposes a novel hybrid approach for robust proofreading that combines convolutional neural networks (CNN) with Bidirectional Encoder Representations from Transformers (BERT). To extract semantic information from phrases and expressions, the CNN applies convolution kernels of several widths to capture local n-gram patterns. Meanwhile, BERT builds context-rich representations of whole sequences using stacked bidirectional transformer encoders. Using BERT's attention mechanisms, the integrated error detection component relates tokens to one another to spot translation irregularities, including word order problems and omissions. The correction module then uses parallel English-German alignment and GRU decoder models…
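As a rough illustration of how convolution kernels of different widths can pick up local n-gram patterns, the sketch below slides each kernel over a sequence of token embeddings, applies a ReLU, and keeps the strongest response (max-over-time pooling). It is not the authors' implementation: the function names, the toy embeddings, and the kernel weights are all hypothetical, and a real system would use learned weights in a deep learning framework.

```python
from typing import List, Sequence

Vector = List[float]


def conv1d_max_pool(embeddings: Sequence[Vector],
                    kernel: Sequence[Vector],
                    bias: float = 0.0) -> float:
    """Slide one kernel over the token sequence and max-pool the ReLU outputs.

    The kernel has one weight vector per token position it covers, so a
    kernel with 2 rows responds to bigrams and one with 3 rows to trigrams.
    """
    width = len(kernel)
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the kernel width")
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        score = bias + sum(
            w * x
            for kernel_row, token_vec in zip(kernel, window)
            for w, x in zip(kernel_row, token_vec)
        )
        best = max(best, score)
    return best


def ngram_features(embeddings: Sequence[Vector],
                   kernels: Sequence[Sequence[Vector]]) -> List[float]:
    """Return one pooled feature per kernel (kernels may differ in width)."""
    return [conv1d_max_pool(embeddings, k) for k in kernels]


# Hypothetical 2-dimensional embeddings for a 4-token sentence.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Hypothetical kernels: one bigram detector and one trigram detector.
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
]

features = ngram_features(sentence, kernels)
print(features)
```

Each entry of `features` summarizes how strongly one n-gram pattern appears anywhere in the sentence. In a hybrid model of the kind described, such pooled features would typically be combined with contextual representations from BERT, though the abstract does not specify how the two are fused.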
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Translation Studies and Practices
Methods: Layer Normalization · Linear Warmup With Linear Decay · Attention Dropout · Softmax · Linear Layer · Dropout · Dense Connections · Attention Is All You Need · WordPiece
