End-to-End Text Recognition with Hybrid HMM Maxout Models

Ouais Alsharif; Joelle Pineau

arXiv:1310.1811 · cs.CV · October 8, 2013 · ICLR · 100 cites

Open Access

TL;DR

This paper presents an end-to-end text recognition system for natural scenes that combines Maxout networks with hybrid HMM models and reports state-of-the-art accuracy on the ICDAR 2003 and SVT benchmarks.

Contribution

The paper applies Maxout networks and hybrid HMM models to character and word recognition in scene text, then combines these components into an end-to-end system that outperforms previous methods.

Findings

01 Achieved state-of-the-art accuracy on the ICDAR 2003 dataset

02 Outperformed existing methods on the SVT benchmark

03 Built a tunable and highly accurate recognition system

Abstract

The problem of detecting and recognizing text in natural scenes has proved to be more challenging than its counterpart in documents, with most of the previous work focusing on a single part of the problem. In this work, we propose new solutions to the character and word recognition problems and then show how to combine these solutions in an end-to-end text-recognition system. We do so by leveraging the recently introduced Maxout networks along with hybrid HMM models that have proven useful for voice recognition. Using these elements, we build a tunable and highly accurate recognition system that beats state-of-the-art results on all the sub-problems for both the ICDAR 2003 and SVT benchmark datasets.
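
For readers unfamiliar with the Maxout units named in the abstract, the following is a minimal sketch of the generic Maxout activation, in which each unit outputs the maximum over several affine pieces of its input. It is not the authors' implementation: the function names, the plain-list representation, and the example weights are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def maxout_unit(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> float:
    """Return max_j (w_j . x + b_j) over the unit's linear pieces.

    `weights` holds one weight vector per piece and `biases` one bias
    per piece (hypothetical layout, chosen for readability).
    """
    if not weights or len(weights) != len(biases):
        raise ValueError("need one bias per piece and at least one piece")
    pieces = []
    for w, b in zip(weights, biases):
        if len(w) != len(x):
            raise ValueError("weight vector and input must have equal length")
        # Affine piece: dot product of weights and input, plus bias.
        pieces.append(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # The activation is the largest of the affine pieces.
    return max(pieces)


def maxout_layer(
    x: Sequence[float],
    layer_weights: Sequence[Sequence[Sequence[float]]],
    layer_biases: Sequence[Sequence[float]],
) -> List[float]:
    """Apply one Maxout unit per entry of `layer_weights`."""
    return [maxout_unit(x, w, b) for w, b in zip(layer_weights, layer_biases)]


# Illustrative values only: a unit with three pieces over a 2-d input.
example_output = maxout_unit(
    [1.0, -2.0],
    [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
    [0.0, 0.0, 0.5],
)
```

With two pieces `w = [1]` and `w = [-1]` and zero biases, a single unit computes the absolute value of a scalar input, which shows how the maximum over linear pieces yields a learned piecewise-linear activation.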

Taxonomy

Topics: Handwritten Text Recognition Techniques · Speech Recognition and Synthesis · Natural Language Processing Techniques

Methods: Maxout