Zone-based Keyword Spotting in Bangla and Devanagari Documents

Ayan Kumar Bhunia; Partha Pratim Roy; Umapada Pal

arXiv:1712.01434·cs.CV·December 6, 2017

TL;DR

This paper introduces a zone-based keyword spotting system for Bangla and Devanagari scripts that leverages HMM-based zone segmentation and a novel foreground-background feature to improve recognition accuracy.

Contribution

It proposes a new zone segmentation method using HMMs and a combined foreground-background feature for better keyword spotting in Indic scripts.

Findings

01. Significant performance improvement over traditional methods

02. HMM-based zone segmentation effectively isolates script zones

03. Combined foreground-background features outperform individual features

Abstract

In this paper we present a word spotting system in text lines for offline Indic scripts such as Bangla (Bengali) and Devanagari. Recently, it was shown that a zone-wise recognition method improves word recognition performance over conventional full-word recognition systems in Indic scripts. Inspired by this idea, we adopt the zone segmentation approach and use middle-zone information to improve traditional word spotting performance. To avoid the problems of heuristic zone segmentation, we propose an HMM-based approach to segment the upper and lower zone components from text line images. Candidate keywords are searched for in a line without segmenting characters or words. We also propose a novel feature combining foreground and background information of text line images for keyword spotting with character filler models. A significant improvement in…
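To make the idea of combining foreground and background information more concrete, the sketch below computes simple sliding-window features over a binarized text line. It is an illustration only and not the feature defined in the paper: the function names (`column_profile`, `sliding_window_features`), the three chosen measurements (ink density, gap above the ink, gap below the ink), and the window parameters are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: rows of 0 (background) and 1 (foreground ink).
Image = List[List[int]]


def column_profile(img: Image, x: int) -> Tuple[float, float, float]:
    """Return (ink density, top gap, bottom gap) for column x, each in [0, 1]."""
    height = len(img)
    column = [img[y][x] for y in range(height)]
    ink = sum(column)
    if ink == 0:
        # Empty column: no foreground, so both background gaps span the full height.
        return 0.0, 1.0, 1.0
    top = column.index(1)           # background pixels above the first ink pixel
    bottom = column[::-1].index(1)  # background pixels below the last ink pixel
    return ink / height, top / height, bottom / height


def sliding_window_features(
    img: Image, width: int = 2, step: int = 1
) -> List[Tuple[float, float, float]]:
    """Average the per-column profiles inside each window, moving left to right."""
    n_cols = len(img[0])
    feats = []
    for start in range(0, n_cols - width + 1, step):
        profiles = [column_profile(img, x) for x in range(start, start + width)]
        feats.append(tuple(sum(p[i] for p in profiles) / width for i in range(3)))
    return feats
```

In an HMM-based spotting pipeline, a sequence of such window vectors would typically serve as the observation sequence for the character and filler models; the actual feature set used by the authors is described in the full paper.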
