Text Localization in Video Using Multiscale Weber's Local Descriptor

B.H. Shekar; Smitha M.L.

arXiv:1504.03810·cs.CV·November 17, 2016

TL;DR

This paper introduces a multiscale Weber's Local Descriptor-based method for detecting and localizing text in videos, effectively handling various text sizes, fonts, and colors through a sequence of image processing steps.

Contribution

It presents a novel multiscale WLD approach combined with morphological and connected component analysis for accurate video text localization.

Findings

01. Effective detection of text in various sizes and fonts

02. High accuracy demonstrated on standard video datasets

03. Robust localization in complex video scenes

Abstract

In this paper, we propose a novel approach for detecting the text present in videos and scene images based on the Multiscale Weber's Local Descriptor (MWLD). Given an input video, the shots are identified and the key frames are extracted based on their spatio-temporal relationship. From each key frame, we extract local region information using WLD with different radii and neighborhood relationships among pixel values, thereby obtaining intensity-enhanced key frames at multiple scales. These multiscale WLD key frames are merged together, and the horizontal gradients are then computed using morphological operations. The results are binarized, and false positives are eliminated based on geometrical properties. Finally, we employ connected component analysis and a morphological dilation operation to determine the text regions, which aids text localization. The experimental…
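To make the multiscale step concrete, here is a minimal sketch of WLD differential excitation computed at several radii and merged into one map. It is an illustration under stated assumptions, not the authors' implementation: the function names `differential_excitation` and `multiscale_wld`, the eight-neighbour square sampling at each radius, the radii `(1, 2, 3)`, and the merge by averaging are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np


def differential_excitation(gray, radius=1, eps=1e-6):
    """WLD differential excitation: arctan(sum(x_i - x_c) / x_c).

    Assumption: the 8 neighbours x_i are sampled on a square ring at
    distance `radius` from the centre pixel x_c.
    """
    img = gray.astype(np.float64)
    h, w = img.shape
    # Replicate border pixels so every pixel has all 8 neighbours.
    padded = np.pad(img, radius, mode="edge")
    diff_sum = np.zeros_like(img)
    for dy in (-radius, 0, radius):
        for dx in (-radius, 0, radius):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            diff_sum += neighbour - img
    # eps guards against division by zero on black pixels.
    return np.arctan(diff_sum / (img + eps))


def multiscale_wld(gray, radii=(1, 2, 3)):
    """Merge excitation magnitudes across radii.

    Assumption: scales are merged by a plain average; the result is
    rescaled to [0, 1] by dividing by pi/2, the bound of |arctan|.
    """
    maps = [np.abs(differential_excitation(gray, r)) for r in radii]
    return np.mean(maps, axis=0) / (np.pi / 2)


if __name__ == "__main__":
    frame = np.full((8, 8), 10.0)
    frame[4, 4] = 200.0  # a single bright pixel on a flat background
    print(multiscale_wld(frame).round(2))
```

A flat region yields zero excitation, while pixels that differ strongly from their neighbours yield large values, which is why the merged map highlights high-contrast strokes such as text before the gradient, binarization, and connected-component stages described above.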
