Video Text Localization using Wavelet and Shearlet Transforms
Purnendu Banerjee, B. B. Chaudhuri

TL;DR
This paper introduces a video text detection method that combines wavelet and shearlet transforms to identify text regions, exploiting their sparse representations of point-like and curve-like structures, respectively.
Contribution
It presents a new approach that combines wavelet and shearlet transforms with K-means clustering for improved video text localization.
Findings
Method outperforms existing techniques on standard datasets.
Effective detection of text regions in complex video backgrounds.
Utilizes combined features for robust text localization.
Abstract
Text in video is useful and important for indexing and retrieving video documents efficiently and accurately. In this paper, we present a new method of text detection using a combined dictionary consisting of wavelets and a recently introduced transform called shearlets. Wavelets provide optimally sparse expansions for point-like structures, and shearlets provide optimally sparse expansions for curve-like structures. By combining these two features, we compute a high-frequency sub-band to brighten the text part. K-means clustering is then used to obtain text pixels from the standard deviation (SD) of the combined wavelet and shearlet coefficients, as well as from the union of the wavelet and shearlet features. Text parts are obtained by grouping neighboring regions based on geometric properties of the classified output frame of unsupervised K-means classification. The proposed method…
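The pipeline described in the abstract (high-frequency coefficients, then a standard-deviation feature, then unsupervised two-class K-means) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-level Haar wavelet as the only transform and omits the shearlet branch entirely. The 3x3 SD window, the two-cluster setup, and all function names (`haar_detail_energy`, `local_sd`, `kmeans_two_clusters`, `text_mask`) are hypothetical choices made for illustration. The final geometric grouping step is not shown.

```python
from statistics import pstdev


def haar_detail_energy(img):
    """One-level 2-D Haar transform on 2x2 blocks.

    Returns the summed magnitude of the three detail (high-frequency)
    coefficients per block, at half the input resolution.
    ASSUMPTION: Haar stands in for the paper's wavelet; shearlets are omitted.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            d, e = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            row_diff = (a + b - d - e) / 2.0   # top row minus bottom row
            col_diff = (a - b + d - e) / 2.0   # left column minus right column
            diag = (a - b - d + e) / 2.0       # diagonal detail
            out[r][c] = abs(row_diff) + abs(col_diff) + abs(diag)
    return out


def local_sd(feat, radius=1):
    """Population standard deviation in a (2*radius+1)^2 window, clipped at borders."""
    rows, cols = len(feat), len(feat[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                feat[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = pstdev(window)
    return out


def kmeans_two_clusters(values, iters=20):
    """Plain 1-D K-means with k=2; label 1 is the cluster with the larger center."""
    lo, hi = min(values), max(values)  # initialise centers at the extremes
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        ones = [v for v, lab in zip(values, labels) if lab == 1]
        zeros = [v for v, lab in zip(values, labels) if lab == 0]
        if ones:
            hi = sum(ones) / len(ones)
        if zeros:
            lo = sum(zeros) / len(zeros)
    return labels


def text_mask(img):
    """Half-resolution 0/1 mask: 1 marks candidate text pixels."""
    sd = local_sd(haar_detail_energy(img))
    flat = [v for row in sd for v in row]
    labels = kmeans_two_clusters(flat)
    cols = len(sd[0])
    return [labels[i:i + cols] for i in range(0, len(labels), cols)]
```

Calling `text_mask(frame)` on a grayscale frame given as a list of rows returns a mask at half the input resolution. Because this sketch clusters on local SD alone, it responds most strongly where detail energy changes, so the interior of a perfectly uniform texture can be labeled as background. The abstract also clusters on the union of wavelet and shearlet features, which this sketch does not attempt.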
