Entropy-Aware Structural Alignment for Zero-Shot Handwritten Chinese Character Recognition
Qiuming Luo, Tao Zeng, Feng Li, Heming Liu, Rui Mao, Chang Kong

TL;DR
This paper introduces an entropy-aware structural alignment network for zero-shot handwritten Chinese character recognition, leveraging hierarchical features and information-theoretic modeling to improve accuracy and data efficiency.
Contribution
It proposes a novel entropy-aware framework with a dual-view radical tree and semantic feature fusion, addressing limitations of flat radical sequence models.
Findings
Achieves 55.04% accuracy in the ICDAR 2013 zero-shot setting.
Outperforms CLIP-based baselines significantly.
Reaches 92.41% accuracy with only one support sample per class.
Abstract
Zero-shot Handwritten Chinese Character Recognition (HCCR) aims to recognize unseen characters by leveraging radical-based semantic compositions. However, existing approaches often treat characters as flat radical sequences, neglecting the hierarchical topology and the uneven information density of different components. To address these limitations, we propose an Entropy-Aware Structural Alignment Network that bridges the visual-semantic gap through information-theoretic modeling. First, we introduce an Information Entropy Prior to dynamically modulate positional embeddings via multiplicative interaction, acting as a saliency detector that prioritizes discriminative roots over ubiquitous components. Second, we construct a Dual-View Radical Tree to extract multi-granularity structural features, which are integrated via an adaptive Sigmoid-based gating network to encode both global layout…
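The two mechanisms named in the abstract, an entropy-derived weight that multiplies positional embeddings and a Sigmoid gate that mixes two feature views, can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the self-information estimate `-log2 p(r)`, the max-normalisation, the convex-combination gate, and all function and variable names are hypothetical and are not taken from the authors' implementation.

```python
import math
from collections import Counter

import numpy as np


def radical_information(char_to_radicals):
    """Self-information -log2 p(r), where p(r) is the fraction of
    characters in the toy vocabulary that contain radical r."""
    counts = Counter(r for rads in char_to_radicals.values() for r in set(rads))
    total = len(char_to_radicals)
    return {r: -math.log2(c / total) for r, c in counts.items()}


def modulate_positions(pos_emb, radicals, info):
    """Multiplicative interaction: scale each radical's positional
    embedding by its information, normalised by the largest value
    (the normalisation is an assumed choice)."""
    w = np.array([info[r] for r in radicals], dtype=np.float64)
    w = w / (w.max() + 1e-8)
    return pos_emb * w[:, None]  # broadcast one weight per row


def gated_fusion(global_feat, local_feat, W, b):
    """Sigmoid gate g in (0, 1) computed from both views, then an
    elementwise convex combination of the two feature vectors."""
    z = np.concatenate([global_feat, local_feat]) @ W + b
    g = 1.0 / (1.0 + np.exp(-z))
    return g * global_feat + (1.0 - g) * local_feat


# Toy vocabulary: four characters and their radical decompositions.
toy_vocab = {
    "好": ["女", "子"],
    "妈": ["女", "马"],
    "吗": ["口", "马"],
    "她": ["女", "也"],
}
info = radical_information(toy_vocab)

# Positional embeddings for the two radicals of "好" (all ones, for clarity).
pos_emb = np.ones((2, 4))
weighted = modulate_positions(pos_emb, toy_vocab["好"], info)

# Fuse two 4-dimensional views with an untrained (all-zero) gate.
d = 4
fused = gated_fusion(np.ones(d), np.zeros(d), np.zeros((2 * d, d)), np.zeros(d))
```

In this toy vocabulary 女 appears in three of four characters, so it receives a smaller weight than 子, which appears in only one. That is the intended saliency effect: rare, discriminative components are emphasised over ubiquitous ones. With an all-zero gate the fusion reduces to a plain average of the two views; a trained gate would shift that balance per feature dimension.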
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Neural Network Applications · Multimodal Machine Learning Applications
