PP-OCR: A Practical Ultra Lightweight OCR System
Yuning Du, Chenxia Li, Ruoyu Guo, Xiaoting Yin, Weiwei Liu, Jun Zhou, Yifan Bai, Zilin Yu, Yehua Yang, Qingqing Dang, Haoshuang Wang

TL;DR
PP-OCR is a highly compact and efficient OCR system capable of recognizing multiple languages with a model size under 4MB, suitable for diverse practical applications.
Contribution
This paper introduces PP-OCR, a lightweight OCR system with strategies to significantly reduce model size while maintaining high recognition accuracy across multiple languages.
Findings
Model size is only 3.5MB for recognizing 6622 Chinese characters and 2.8MB for recognizing 63 alphanumeric symbols
Achieved high accuracy in multilingual OCR tasks
Open-sourced models and code for broad accessibility
Abstract
Optical Character Recognition (OCR) systems are widely used in a variety of application scenarios, such as office automation (OA) systems, factory automation, online education, and map production. However, OCR remains a challenging task because of the variety of text appearances and the demand for computational efficiency. In this paper, we propose a practical ultra lightweight OCR system, PP-OCR. The overall model size of PP-OCR is only 3.5M for recognizing 6622 Chinese characters and 2.8M for recognizing 63 alphanumeric symbols. We introduce a bag of strategies to either enhance the model's ability or reduce its size. The corresponding ablation experiments on real data are also provided. In addition, several pre-trained models for Chinese and English recognition are released, including a text detector (97K images are used), a direction…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction
Methods: PP-OCR
