Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition
Bingcong Li, Xin Tang, Xianbiao Qi, Yihao Chen, Rong Xiao

TL;DR
Hamming OCR introduces a lightweight scene text recognition model using locality sensitive hashing to encode characters, significantly reducing model size while maintaining competitive accuracy across multiple languages and datasets.
Contribution
The paper proposes a Hamming classifier that encodes each character with an LSH code, together with a simplified transformer decoder, so that the number of model parameters no longer depends on vocabulary size.
Findings
Achieves competitive accuracy on multiple datasets.
Keeps model size independent of vocabulary size.
Effective for Chinese and multilingual text recognition.
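A rough parameter count helps illustrate why vocabulary size matters for a softmax head but not for a bit-prediction head. The sketch below is only an illustration: the function names, the hidden width of 512, the 32-bit code length, and the vocabulary sizes are assumptions, not figures from the paper. It also assumes the fixed LSH codes are used as-is in place of a learned embedding table, so they add no trainable parameters.

```python
def softmax_head_params(vocab_size, d_model):
    """Trainable parameters of a softmax classifier plus a learned output embedding."""
    # Classifier: weight matrix (d_model x vocab_size) plus one bias per class.
    classifier = d_model * vocab_size + vocab_size
    # Output embedding table: one d_model-dimensional vector per character.
    embedding = vocab_size * d_model
    return classifier + embedding


def hamming_head_params(code_bits, d_model):
    """Trainable parameters of a hypothetical head that predicts code_bits bits."""
    # One linear layer producing a logit per code bit; no vocab-sized matrices.
    # Assumption: the fixed LSH codes replace the embedding table and are not trained.
    return d_model * code_bits + code_bits


if __name__ == "__main__":
    d_model, code_bits = 512, 32  # assumed values, for illustration only
    for vocab_size in (100, 7000, 20000):
        print(vocab_size,
              softmax_head_params(vocab_size, d_model),
              hamming_head_params(code_bits, d_model))
```

Under these assumptions the first count grows linearly with the vocabulary, while the second depends only on the code length and the hidden width.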
Abstract
Recently, inspired by the Transformer, self-attention-based scene text recognition approaches have achieved outstanding performance. However, we find that the model size grows rapidly as the lexicon increases. Specifically, the number of parameters in the softmax classification layer and the output embedding layer is proportional to the vocabulary size. This hinders the development of lightweight text recognition models, especially for Chinese and multilingual text. Thus, we propose a lightweight scene text recognition model named Hamming OCR. In this model, a novel Hamming classifier, which adopts a locality sensitive hashing (LSH) algorithm to encode each character, is proposed to replace the softmax regression, and the generated LSH code is directly employed to replace the output embedding. We also present a simplified transformer decoder to reduce the number of parameters by…
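The idea of classifying by binary code rather than by softmax can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes sign-random-projection LSH over toy per-character feature vectors (the abstract does not say how the codes are actually generated), and every name here (`make_hyperplanes`, `lsh_code`, `decode`, the 16-bit code length) is hypothetical. The model is imagined to emit one logit per bit, and the predicted character is the one whose code is nearest in Hamming distance.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """Sign random projection: one bit per hyperplane (1 if the dot product is >= 0)."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming_distance(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bit_logits, codebook):
    """Threshold per-bit logits at 0, then return the character with the nearest code."""
    predicted = tuple(1 if z > 0 else 0 for z in bit_logits)
    return min(codebook, key=lambda ch: hamming_distance(predicted, codebook[ch]))


# Toy setup: random feature vectors stand in for whatever the real model would use.
chars = ["a", "b", "中", "文"]
feature_rng = random.Random(1)
features = {ch: [feature_rng.gauss(0.0, 1.0) for _ in range(8)] for ch in chars}
planes = make_hyperplanes(num_bits=16, dim=8)
codebook = {ch: lsh_code(vec, planes) for ch, vec in features.items()}

# Simulated model output for "中" with one bit flipped by noise.
logits = [2.0 if bit else -2.0 for bit in codebook["中"]]
logits[0] = -logits[0]
print(decode(logits, codebook))
```

In this sketch the output layer needs one unit per code bit rather than one per character, which is why its size would not change when characters are added to the codebook.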
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Handwritten Text Recognition Techniques · Vehicle License Plate Recognition
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Dropout · Byte Pair Encoding · Layer Normalization · Label Smoothing · Multi-Head Attention · Attention Is All You Need
