Self-distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Ziyin Zhang, Ning Lu, Minghui Liao, Yongshuai Huang, Cheng Li, Min Wang, Wei Peng

TL;DR
This paper introduces DCTC, a self-distillation regularization for CTC loss in text recognition, improving accuracy by emphasizing individual character learning without extra modules or data.
Contribution
It proposes a novel self-distillation regularization scheme for CTC loss, enhancing character-level supervision in text recognition models.
Findings
DCTC improves accuracy by up to 2.6% on benchmarks.
The method requires no extra parameters or data.
It addresses CTC's tendency to optimize the whole sequence target while neglecting individual characters.
Abstract
Text recognition methods are developing rapidly. Advanced techniques, e.g., powerful modules, language models, and un- and semi-supervised learning schemes, keep pushing performance on public benchmarks forward. However, the problem of how to better optimize a text recognition model from the perspective of loss functions is largely overlooked. CTC-based methods, widely used in practice due to their good balance between performance and inference speed, still grapple with accuracy degradation. This is because CTC loss emphasizes the optimization of the entire sequence target while neglecting to learn individual characters. We propose a self-distillation scheme for CTC-based models to address this issue. It incorporates a framewise regularization term in CTC loss to emphasize individual supervision, and leverages the maximizing-a-posteriori of latent alignment to solve…
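The idea in the abstract, a sequence-level CTC loss plus a framewise term supervised by the model's own most probable alignment, can be illustrated with a minimal pure-Python sketch. The exact DCTC formulation is not given in this excerpt, so the following are assumptions made only for illustration: the framewise term is a cross-entropy against the Viterbi (MAP) alignment, it is averaged over frames, and it is weighted by a hypothetical coefficient `lam`. All function names are hypothetical as well.

```python
import math

NEG_INF = float("-inf")

def logsumexp(*xs):
    # Numerically stable log(sum(exp(x))) that tolerates -inf entries.
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))

def extend_with_blanks(target, blank=0):
    # Standard CTC label extension: blank, c1, blank, c2, ..., blank.
    ext = [blank]
    for c in target:
        ext += [c, blank]
    return ext

def predecessors(ext, s, blank):
    # States that may precede state s in the CTC transition graph.
    prev = [s]
    if s >= 1:
        prev.append(s - 1)
    if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
        prev.append(s - 2)
    return prev

def ctc_neg_log_likelihood(log_probs, target, blank=0):
    # Forward algorithm; log_probs is a T x C list of per-frame log-probabilities.
    ext = extend_with_blanks(target, blank)
    S = len(ext)
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]
    for frame in log_probs[1:]:
        alpha = [
            logsumexp(*(alpha[p] for p in predecessors(ext, s, blank))) + frame[ext[s]]
            for s in range(S)
        ]
    total = logsumexp(alpha[-1], alpha[-2]) if S > 1 else alpha[-1]
    return -total

def map_alignment(log_probs, target, blank=0):
    # Viterbi decoding restricted to alignments that collapse to `target`.
    ext = extend_with_blanks(target, blank)
    S = len(ext)
    score = [NEG_INF] * S
    score[0] = log_probs[0][ext[0]]
    if S > 1:
        score[1] = log_probs[0][ext[1]]
    back = []
    for frame in log_probs[1:]:
        ptr = [max(predecessors(ext, s, blank), key=lambda p: score[p]) for s in range(S)]
        score = [score[ptr[s]] + frame[ext[s]] for s in range(S)]
        back.append(ptr)
    s = S - 1 if (S == 1 or score[S - 1] >= score[S - 2]) else S - 2
    path = [s]
    for ptr in reversed(back):
        s = ptr[s]
        path.append(s)
    path.reverse()
    return [ext[s] for s in path]

def dctc_style_loss(log_probs, target, lam=0.1, blank=0):
    # Assumed form: CTC loss + lam * mean framewise cross-entropy vs. MAP alignment.
    ctc = ctc_neg_log_likelihood(log_probs, target, blank)
    align = map_alignment(log_probs, target, blank)
    frame_ce = -sum(lp[k] for lp, k in zip(log_probs, align)) / len(log_probs)
    return ctc + lam * frame_ce

# Toy example: 4 frames, 3 classes (0 = blank, 1 = "a", 2 = "b"), target "ab".
probs = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
log_probs = [[math.log(p) for p in frame] for frame in probs]
alignment = map_alignment(log_probs, [1, 2])
loss = dctc_style_loss(log_probs, [1, 2], lam=0.1)
```

In a real training setup the alignment would be treated as a fixed pseudo-label (no gradient flows through the Viterbi step), which is what makes the scheme a form of self-distillation: the model's own best alignment supplies per-frame targets without extra modules or data.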
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Handwritten Text Recognition Techniques
Methods: Connectionist Temporal Classification Loss
