Task Grouping for Multilingual Text Recognition

Jing Huang; Kevin J Liang; Rama Kovvuri; Tal Hassner

arXiv:2210.07423 · cs.CV · October 17, 2022

TL;DR

This paper introduces an automatic task grouping method for multilingual text recognition, leveraging language similarities to improve OCR accuracy through joint training and dynamic grouping.

Contribution

The paper proposes a Gumbel-Softmax-based task grouping and assignment module, trained with a task grouping loss and a weighted recognition loss, so that language groupings are learned at the same time as the recognition models.

Findings

1. Joint training with task grouping improves recognition accuracy.
2. Optimal language groupings outperform full separation or full sharing.
3. Experimental results on MLT19 support the effectiveness of the proposed method.

Abstract

Most existing OCR methods focus on alphanumeric characters due to the popularity of English and numbers, as well as their corresponding datasets. On extending the characters to more languages, recent methods have shown that training different scripts with different recognition heads can greatly improve the end-to-end recognition accuracy compared to combining characters from all languages in the same recognition head. However, we postulate that similarities between some languages could allow sharing of model parameters and benefit from joint training. Determining language groupings, however, is not immediately obvious. To this end, we propose an automatic method for multilingual text recognition with a task grouping and assignment module using Gumbel-Softmax, introducing a task grouping loss and weighted recognition loss to allow for simultaneous training of the models and grouping…
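To make the grouping idea concrete, here is a minimal sketch of Gumbel-Softmax task-to-group assignment in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`gumbel_softmax`, `assign_tasks`, `weighted_recognition_loss`), the temperature `tau`, and the simple probability-weighted loss are all hypothetical, and the paper's actual task grouping loss is not reproduced here.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample from unnormalized logits."""
    perturbed = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise) / tau)
    # Numerically stable softmax over the perturbed logits.
    peak = max(perturbed)
    exps = [math.exp(v - peak) for v in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


def assign_tasks(assignment_logits, tau=1.0, rng=random):
    """Sample a soft assignment per task (language) and its argmax group.

    assignment_logits[i][j] is a hypothetical learnable score for sending
    task i to recognition head j.
    """
    soft = [gumbel_softmax(row, tau, rng) for row in assignment_logits]
    hard = [max(range(len(row)), key=row.__getitem__) for row in soft]
    return soft, hard


def weighted_recognition_loss(soft, per_head_losses):
    """Assumed form: weight each (task, head) loss by its assignment weight."""
    return sum(
        weight * loss
        for weights, losses in zip(soft, per_head_losses)
        for weight, loss in zip(weights, losses)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Two tasks (languages), three candidate recognition heads.
    logits = [[2.0, 0.0, -1.0], [0.0, 0.0, 3.0]]
    soft, hard = assign_tasks(logits, tau=0.5, rng=rng)
    print("soft assignments:", soft)
    print("hard assignments:", hard)
```

Lower values of `tau` push each soft assignment closer to a one-hot vector, while higher values keep it closer to uniform. In a real training loop the logits would be learnable parameters in an autodiff framework, so that gradients from the recognition loss can flow back into the grouping.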

Code & Models

Repositories

facebookresearch/MultiplexedOCR (PyTorch, official)

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Speech Recognition and Synthesis