Task Grouping for Multilingual Text Recognition
Jing Huang, Kevin J Liang, Rama Kovvuri, Tal Hassner

TL;DR
This paper introduces an automatic task grouping method for multilingual text recognition, leveraging language similarities to improve OCR accuracy through joint training and dynamic grouping.
Contribution
The paper proposes a Gumbel-Softmax-based task grouping and assignment module, trained with a task grouping loss and a weighted recognition loss, so that language groupings are learned at the same time as the recognition models.
Findings
Joint training with task grouping improves recognition accuracy.
Learned language groupings outperform both fully separate recognition heads and a single fully shared head.
Experimental results on MLT19 support the effectiveness of the proposed method.
Abstract
Most existing OCR methods focus on alphanumeric characters due to the popularity of English and numbers, as well as their corresponding datasets. On extending the characters to more languages, recent methods have shown that training different scripts with different recognition heads can greatly improve the end-to-end recognition accuracy compared to combining characters from all languages in the same recognition head. However, we postulate that similarities between some languages could allow sharing of model parameters and benefit from joint training. Determining language groupings, however, is not immediately obvious. To this end, we propose an automatic method for multilingual text recognition with a task grouping and assignment module using Gumbel-Softmax, introducing a task grouping loss and weighted recognition loss to allow for simultaneous training of the models and grouping…
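The assignment mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the logits, the temperature, and the per-head loss values are all hypothetical, and the task grouping loss is omitted because the summary does not define it. The sketch only shows how a Gumbel-Softmax sample turns per-task logits into a soft assignment over recognition heads, and how that assignment could weight per-head recognition losses.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample over heads for a single task."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1);
        # U is clamped away from 0 and 1 to keep both logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_recognition_loss(assignments, head_losses):
    """Sum per-head losses, each weighted by the task's soft assignment.

    assignments[t][h] is the weight of task t on head h;
    head_losses[t][h] is the (hypothetical) loss of head h on task t.
    """
    return sum(
        weight * loss
        for task_weights, task_losses in zip(assignments, head_losses)
        for weight, loss in zip(task_weights, task_losses)
    )


# Hypothetical setup: three tasks (languages) and two recognition heads.
rng = random.Random(0)
task_logits = [[2.0, 0.0], [1.5, 0.0], [0.0, 2.0]]
head_losses = [[0.3, 0.9], [0.4, 0.8], [1.1, 0.2]]

assignments = [gumbel_softmax(row, tau=0.5, rng=rng) for row in task_logits]
total_loss = weighted_recognition_loss(assignments, head_losses)
```

Lowering `tau` pushes each sample toward a hard one-hot assignment, while a higher value keeps the weights spread across heads. In a real training loop the logits would be learnable parameters updated by backpropagation, which this standard-library sketch does not attempt.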
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Speech Recognition and Synthesis
