Understanding Cross-Language Transfer Improvements in Low-Resource HTR: The Role of Sequence Modeling

Sana Al-azzawi; Chang Liu; Nudrat Habib; Elisa Barney; Marcus Liwicki

arXiv:2605.05900 · cs.CV · May 8, 2026

TL;DR

This study investigates how sequence modeling, rather than shared visual features, enhances cross-language transfer in low-resource handwritten text recognition for Arabic-script languages, emphasizing the importance of contextual dependencies.

Contribution

The paper provides a controlled comparison showing that sequence-level modeling, beyond shared visual representations, is key to transfer improvements in low-resource HTR.

Findings

01. CRNN models outperform CNN-only models in multi-script transfer scenarios.

02. Sequence-level modeling correlates with improved transfer performance.

03. Shared visual features alone are insufficient for effective cross-language transfer.

Abstract

Handwritten Text Recognition (HTR) for Arabic-script languages benefits from cross-language joint training under low-resource conditions, particularly when using CRNN-based models that combine convolutional encoders with sequence modeling. However, it remains unclear whether these improvements are better explained by shared visual representations or sequence-level dependencies. In this work, we conduct a controlled architectural study of line-level Arabic-script HTR, comparing CNN-only models with CTC decoding and CRNN models under identical single-script and multi-script training regimes. Experiments are performed on Arabic (KHATT), Urdu (NUST-UHWR), and Persian (PHTD) datasets under low-resource settings (K in {100, 500, 1000}). Our results show a clear divergence in transfer behavior: while CNN-only models exhibit limited or unstable improvements, CRNN models achieve better…
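The abstract mentions CTC decoding for the CNN-only models. As a rough illustration of what CTC decoding involves, the sketch below implements greedy (best-path) decoding in plain Python: take the highest-scoring class in each frame, merge consecutive repeats, and drop blanks. The function name `ctc_greedy_decode`, the blank index `0`, and the toy scores are assumptions made for this example; the abstract does not say which decoding strategy the authors use.

```python
from typing import List, Sequence

# Assumption for this sketch: class index 0 is the CTC blank symbol.
BLANK = 0


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Greedy (best-path) CTC decoding.

    frame_scores holds one row of per-class scores for each time step
    (for a text line image, one row per horizontal frame).
    """
    decoded: List[int] = []
    previous = blank
    for scores in frame_scores:
        # Pick the highest-scoring class for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a label only if it is not blank and not a repeat of the
        # label chosen in the immediately preceding frame.
        if best != blank and best != previous:
            decoded.append(best)
        previous = best
    return decoded


# Toy input (hypothetical): 6 frames, 3 classes (blank, label 1, label 2).
# The per-frame argmax sequence is 1, 1, blank, 1, 2, 2.
example_frames = [
    [0.10, 0.80, 0.10],
    [0.20, 0.70, 0.10],
    [0.90, 0.05, 0.05],
    [0.10, 0.60, 0.30],
    [0.10, 0.20, 0.70],
    [0.20, 0.10, 0.70],
]

print(ctc_greedy_decode(example_frames))  # prints [1, 1, 2]
```

The blank frame between the two runs of label 1 is what lets the decoder output the same character twice in a row. The decoding rule treats each frame's scores independently, so in a CNN-only model any context has to come from the convolutional receptive field, whereas a CRNN adds a recurrent layer that mixes information across frames before this step.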
