Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues

Anand Mishra; Karteek Alahari; C. V. Jawahar

arXiv:1601.03128 · cs.CV · March 24, 2016

TL;DR

This paper enhances scene text recognition by integrating bottom-up character detections with top-down language cues within an energy minimization framework, achieving improved accuracy on multiple benchmarks.

Contribution

It introduces a novel energy minimization model that combines character detection scores with lexicon-based language priors for scene text recognition.

Findings

01. Outperforms comparable methods on multiple datasets

02. Integrating CNN features further improves accuracy

03. Rigorous analysis validates each step of the approach

Abstract

Recognizing scene text is a challenging problem, even more so than the recognition of scanned documents. This problem has gained significant attention from the computer vision community in recent years, and several methods based on energy minimization frameworks and deep learning approaches have been proposed. In this work, we focus on the energy minimization framework and propose a model that exploits both bottom-up and top-down cues for recognizing cropped words extracted from street images. The bottom-up cues are derived from individual character detections from an image. We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them. These interactions are top-down cues obtained from a lexicon-based prior, i.e., language statistics. The optimal word represented by the text image is obtained by…
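To make the energy formulation concrete, here is a minimal sketch of a chain-structured model over character detections: each position has a unary cost (negative log detection score) and each adjacent pair has a pairwise cost (negative log bigram probability from a lexicon). Everything in it is an assumption for illustration: the function name `min_energy_word`, the weight `lam`, the floor `eps`, and the toy scores are hypothetical, and the exact dynamic-programming (Viterbi) minimizer used here is a stand-in that is not necessarily the inference method used in the paper.

```python
import math

def min_energy_word(unary, bigram_prob, labels, lam=1.0, eps=1e-6):
    """Minimize E(x) = sum_i U_i(x_i) + lam * sum_i P(x_i, x_{i+1}) on a chain.

    unary:       list of dicts, one per character position, label -> score in (0, 1]
    bigram_prob: dict (a, b) -> lexicon-derived probability of b following a
    labels:      iterable of candidate character labels
    Missing scores are floored at eps so the logs stay finite (an assumption).
    """
    def u(i, c):
        return -math.log(max(unary[i].get(c, 0.0), eps))

    def p(a, b):
        return -lam * math.log(max(bigram_prob.get((a, b), 0.0), eps))

    # cost[c] = lowest energy of any labelling of positions 0..i ending in c
    cost = {c: u(0, c) for c in labels}
    back = []  # back[i-1][b] = best predecessor label for b at position i
    for i in range(1, len(unary)):
        new_cost, ptr = {}, {}
        for b in labels:
            best_a = min(labels, key=lambda a: cost[a] + p(a, b))
            new_cost[b] = cost[best_a] + p(best_a, b) + u(i, b)
            ptr[b] = best_a
        cost = new_cost
        back.append(ptr)

    # Trace the minimum-energy labelling back from the last position.
    last = min(labels, key=lambda c: cost[c])
    energy = cost[last]
    word = [last]
    for ptr in reversed(back):
        word.append(ptr[word[-1]])
    return "".join(reversed(word)), energy

# Toy example (made-up numbers): the detector slightly prefers 'o' in the
# middle position, while the lexicon prior favours the bigrams of "cat".
toy_unary = [{"c": 0.9}, {"o": 0.5, "a": 0.45}, {"t": 0.9}]
toy_bigram = {("c", "a"): 0.9, ("a", "t"): 0.9, ("c", "o"): 0.1, ("o", "t"): 0.1}

with_prior, _ = min_energy_word(toy_unary, toy_bigram, "acot", lam=1.0)
without_prior, _ = min_energy_word(toy_unary, toy_bigram, "acot", lam=0.0)
print(with_prior, without_prior)
```

With these toy numbers the bottom-up scores alone (`lam=0.0`) select "cot", while adding the top-down bigram prior (`lam=1.0`) selects "cat", which is the kind of correction the lexicon-based interactions are meant to provide.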
