Sentence Correction Based on Large-scale Language Modelling
Ji Wen

TL;DR
This paper presents a large-scale language model approach for sentence correction and text restoration, introducing new measurement and optimization techniques to efficiently recover missing words in large datasets.
Contribution
It introduces a novel measurement for missing word detection, a comprehensive candidate lexicon, and effective optimization methods to improve efficiency in sentence correction tasks.
Findings
Restores missing text with high accuracy
Reduces processing time to 3.6 seconds for 1000 sentences (about 3.6 ms per sentence)
Enhances efficiency of large-scale sentence correction
Abstract
With the further development of informatization, more and more data is stored as text. Some of this text is lost during generation and transmission. This paper aims to build a language model from a large-scale corpus to restore the missing text. We introduce a novel measurement for locating missing words, and a method for building a comprehensive candidate lexicon from which the correct word is chosen for insertion. We also introduce several effective optimization methods that greatly improve the efficiency of text restoration, reducing the time needed to process 1000 sentences to 3.6 seconds.
Keywords: language model, sentence correction, word imputation, parallel optimization
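The abstract does not spell out the paper's measurement or how its candidate lexicon is built, so the following is only a minimal sketch of the general idea of language-model-based word imputation, not the author's method. It assumes exactly one word is missing, swaps in an add-one-smoothed bigram model as the scoring measurement, and uses the training vocabulary as the candidate lexicon. The function names `train_bigram`, `log_prob`, and `restore_one_word` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # sentence boundary markers (illustrative choice)


def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = [BOS] + sentence.split() + [EOS]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one-smoothed bigram log-probability of a token sequence.

    Assumption: this stands in for the paper's measurement, which the
    abstract does not define.
    """
    vocab_size = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def restore_one_word(sentence, unigrams, bigrams):
    """Try every lexicon word at every gap and keep the most probable result.

    Assumption: exactly one word is missing, and the candidate lexicon is
    simply the training vocabulary.
    """
    tokens = [BOS] + sentence.split() + [EOS]
    lexicon = [w for w in unigrams if w not in (BOS, EOS)]
    best_score, best_tokens = -math.inf, tokens
    for i in range(1, len(tokens)):  # every gap, including sentence start and end
        for word in lexicon:
            candidate = tokens[:i] + [word] + tokens[i:]
            score = log_prob(candidate, unigrams, bigrams)
            if score > best_score:
                best_score, best_tokens = score, candidate
    return " ".join(best_tokens[1:-1])


# Toy corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat ate the fish",
]
unigrams, bigrams = train_bigram(corpus)
print(restore_one_word("the cat sat the mat", unigrams, bigrams))
```

This exhaustive search costs one model evaluation per (gap, candidate) pair, which is the kind of cost that the optimization methods mentioned in the abstract would need to reduce at scale.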
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Web Data Mining and Analysis
