Self-Supervised Learning for Text Recognition: A Critical Survey

Carlos Penarrubia; Jose J. Valero-Mas; Jorge Calvo-Zaragoza

arXiv:2407.19889·cs.CV·June 6, 2025

TL;DR

This paper provides a comprehensive review of self-supervised learning techniques in text recognition, analyzing current methods, comparing results, and proposing standardizations to advance the field.

Contribution

It offers a critical survey of SSL methods for text recognition, consolidating diverse approaches and highlighting gaps for future research.

Findings

01. SSL methods have rapidly evolved for text recognition

02. Current methods lack standardization and comprehensive comparison

03. The survey identifies promising directions for future research

Abstract

Text Recognition (TR) refers to the research area that focuses on retrieving textual information from images, a topic that has seen significant advancements in the last decade due to the use of Deep Neural Networks (DNN). However, these solutions often necessitate vast amounts of manually labeled or synthetic data. Addressing this challenge, Self-Supervised Learning (SSL) has gained attention by utilizing large datasets of unlabeled data to train DNN, thereby generating meaningful and robust representations. Although SSL was initially overlooked in TR because of its unique characteristics, recent years have witnessed a surge in the development of SSL methods specifically for this field. This rapid development, however, has led to many methods being explored independently, without taking previous efforts in methodology or comparison into account, thereby hindering progress in the field…

Taxonomy

Methods: Softmax · Attention Is All You Need