Platypus: A Generalized Specialist Model for Reading Text in Various Forms
Peng Wang, Zhaohai Li, Jun Tang, Humen Zhong, Fei Huang, Zhibo Yang, Cong Yao

TL;DR
Platypus is a unified model that effectively recognizes various forms of text from images, combining the strengths of specialist and generalist approaches for improved accuracy and efficiency.
Contribution
The paper introduces Platypus, a generalized specialist model that unifies multiple text reading tasks with high accuracy and efficiency, and provides a new dataset called Worms for training and evaluation.
Findings
Platypus outperforms existing models on standard benchmarks.
The Worms dataset enhances training for diverse text recognition.
Platypus achieves a balance of accuracy and efficiency in multi-form text reading.
Abstract
Reading text from images (either natural scenes or documents) has been a long-standing research topic for decades, due to the high technical challenge and wide application range. Previously, individual specialist models were developed to tackle the sub-tasks of text reading (e.g., scene text recognition, handwritten text recognition and mathematical expression recognition). However, such specialist models usually cannot effectively generalize across different sub-tasks. Recently, generalist models (such as GPT-4V), trained on tremendous data in a unified way, have shown enormous potential in reading text in various scenarios, but with the drawbacks of limited accuracy and low efficiency. In this work, we propose Platypus, a generalized specialist model for text reading. Specifically, Platypus combines the best of both worlds: being able to recognize text of various forms with a single…
Taxonomy
Topics: Text Readability and Simplification · Mathematics, Computing, and Information Processing
