E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text

Michal Bušta; Yash Patel; Jiri Matas

arXiv:1801.09919 · cs.CV · December 7, 2018

TL;DR

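This paper introduces E2E-MLT, an end-to-end trainable method for multi-language scene text localization and recognition built on a single fully convolutional network (FCN). Although trained on multiple languages, it is competitive with methods trained for English scene text alone, and it needs no language-specific modules.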

Contribution

E2E-MLT is the first published multi-language OCR method for scene text. It handles localization and recognition in one fully differentiable network with shared layers, which simplifies multi-language scene text recognition.

Findings

01 Although trained in a multi-language setup, E2E-MLT is competitive with methods trained for English scene text alone.

02 A single network with shared layers handles both localization and recognition across languages.

03 Obtaining accurate multi-language, multi-script annotations remains challenging.

Abstract

An end-to-end trainable (fully differentiable) method for multi-language scene text localization and recognition is proposed. The approach is based on a single fully convolutional network (FCN) with shared layers for both tasks. E2E-MLT is the first published multi-language OCR for scene text. While trained in a multi-language setup, E2E-MLT demonstrates competitive performance when compared to other methods trained for English scene text alone. The experiments show that obtaining accurate multi-language, multi-script annotations is a challenging problem.
