Self-distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

Ziyin Zhang; Ning Lu; Minghui Liao; Yongshuai Huang; Cheng Li; Min Wang; Wei Peng

arXiv:2308.08806·cs.CV·January 1, 2024

TL;DR

This paper introduces DCTC, a self-distillation regularization for CTC loss in text recognition, improving accuracy by emphasizing individual character learning without extra modules or data.

Contribution

It proposes a novel self-distillation regularization scheme for CTC loss, enhancing character-level supervision in text recognition models.

Findings

1. DCTC improves accuracy by up to 2.6% on benchmarks.

2. The method requires no extra parameters or data.

3. It effectively addresses CTC's sequence-level optimization issue.

Abstract

Text recognition methods are developing rapidly. Advanced techniques such as powerful modules, language models, and un- and semi-supervised learning schemes have steadily pushed performance on public benchmarks forward. However, the problem of how to better optimize a text recognition model from the perspective of loss functions is largely overlooked. CTC-based methods, widely used in practice due to their good balance between performance and inference speed, still grapple with accuracy degradation. This is because CTC loss emphasizes optimizing the entire sequence target while neglecting to learn individual characters. We propose a self-distillation scheme for CTC-based models to address this issue. It incorporates a framewise regularization term in the CTC loss to emphasize individual supervision, and leverages the maximizing-a-posteriori of latent alignment to solve…
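
As a rough illustration of the idea described in the abstract, the sketch below derives a most-probable (Viterbi) CTC alignment from a model's own framewise outputs and uses it as a per-frame target for a cross-entropy regularizer. This is not the authors' implementation; the official repository is the reference. The blank index, the helper names `map_alignment`, `framewise_ce`, and `regularized_loss`, the weight `lam`, and the choice of a hard Viterbi path as the framewise target are all assumptions made for illustration, since the abstract is truncated before the details.

```python
import math

BLANK = 0  # assumption: class index 0 is the CTC blank


def map_alignment(log_probs, target):
    """Most probable CTC alignment of `target` given framewise log-probs.

    log_probs: list of T frames, each a list of per-class log-probabilities.
    target:    list of label indices (no blanks).
    Returns a list of T class indices that collapses to `target`.
    """
    # Blank-extended label sequence: blank, c1, blank, c2, ..., blank
    ext = [BLANK]
    for c in target:
        ext += [c, BLANK]
    T, S = len(log_probs), len(ext)
    neg_inf = float("-inf")
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_probs[0][ext[0]]
    if S > 1:
        score[0][1] = log_probs[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            cands = [s]  # stay on the same extended position
            if s >= 1:
                cands.append(s - 1)  # advance by one
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                cands.append(s - 2)  # skip the blank between distinct labels
            best = max(cands, key=lambda k: score[t - 1][k])
            if score[t - 1][best] == neg_inf:
                continue  # position not reachable at this frame
            score[t][s] = score[t - 1][best] + log_probs[t][ext[s]]
            back[t][s] = best
    # A valid path ends on the final blank or the final label.
    ends = [S - 1] + ([S - 2] if S > 1 else [])
    s = max(ends, key=lambda k: score[T - 1][k])
    if score[T - 1][s] == neg_inf:
        raise ValueError("target is too long for the number of frames")
    path = [BLANK] * T
    for t in range(T - 1, -1, -1):
        path[t] = ext[s]
        s = back[t][s]
    return path


def framewise_ce(log_probs, alignment):
    """Mean per-frame cross-entropy against a hard alignment."""
    return -sum(lp[a] for lp, a in zip(log_probs, alignment)) / len(alignment)


def regularized_loss(ctc_nll, log_probs, target, lam=0.1):
    """Sequence-level CTC loss plus a framewise term (lam is a hypothetical weight)."""
    alignment = map_alignment(log_probs, target)
    return ctc_nll + lam * framewise_ce(log_probs, alignment)


# Toy example: 4 frames, classes {0: blank, 1, 2}, target sequence [1, 2].
probs = [
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
]
log_probs = [[math.log(p) for p in frame] for frame in probs]
alignment = map_alignment(log_probs, [1, 2])
reg = framewise_ce(log_probs, alignment)
print(alignment)  # [1, 0, 2, 2]
```

In a real training loop the `ctc_nll` argument would come from the framework's CTC loss, and the alignment would typically be treated as a fixed target (no gradient flows through it), which is what makes the scheme a form of self-distillation in this reading.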

Code & Models

Repositories

zzyhlyoko/DCTC (PyTorch, official)

Videos

Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach · underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Handwritten Text Recognition Techniques

Methods: Connectionist Temporal Classification Loss