R-PHOC: Segmentation-Free Word Spotting using CNN
Suman Ghosh, Ernest Valveny

TL;DR
This paper introduces R-PHOC, a region-based CNN for segmentation-free word spotting. It embeds candidate word bounding boxes into a PHOC embedding space, where spotting reduces to a nearest-neighbour search against the query, and it improves on the state of the art on the GW dataset.
Contribution
It presents a novel segmentation-free approach using CNNs and PHOC embeddings for word spotting, eliminating the need for explicit segmentation.
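For readers unfamiliar with PHOC (pyramidal histogram of characters), the sketch below builds a binary PHOC vector from a text string, following the commonly described definition: at each pyramid level the word is split into equal regions, and a character is assigned to a region when at least half of its span falls inside it. The alphabet, the pyramid levels, and the function name `phoc` are assumptions chosen for illustration; this summary does not state the configuration used in the paper, and bigram components are omitted.

```python
import numpy as np

# Assumed alphabet and pyramid levels; the paper's exact settings are not given here.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"
LEVELS = (2, 3, 4, 5)


def phoc(word, alphabet=ALPHABET, levels=LEVELS):
    """Return a binary unigram PHOC vector for `word` (illustrative sketch)."""
    word = word.lower()
    n = len(word)
    vec = np.zeros(sum(levels) * len(alphabet), dtype=np.float32)
    if n == 0:
        return vec
    offset = 0
    for level in levels:
        for region in range(level):
            # Work in integer units of 1 / (n * level) to avoid float rounding.
            r_start, r_end = region * n, (region + 1) * n
            for i, ch in enumerate(word):
                idx = alphabet.find(ch)
                if idx < 0:
                    continue  # characters outside the alphabet are ignored
                c_start, c_end = i * level, (i + 1) * level
                overlap = min(r_end, c_end) - max(r_start, c_start)
                # Character width is `level` units; require at least 50% overlap.
                if 2 * overlap >= level:
                    vec[offset + region * len(alphabet) + idx] = 1.0
        offset += level * len(alphabet)
    return vec
```

With these assumed settings, `phoc("ab")` has length (2 + 3 + 4 + 5) × 36 = 504, and at level 2 the letter "a" is set only in the first half and "b" only in the second.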
Findings
Outperforms current state-of-the-art on GW dataset
Operates directly on images without segmentation
Performs as well as PHOCNET, a segmentation-based method, in some cases
Abstract
This paper proposes a region-based convolutional neural network for segmentation-free word spotting. Our network takes as input an image and a set of word candidate bounding boxes and embeds all bounding boxes into an embedding space, where word spotting can be cast as a simple nearest neighbour search between the query representation and each of the candidate bounding boxes. We make use of the PHOC embedding, as it has previously achieved significant success in segmentation-based word spotting. Word candidates are generated using a simple procedure based on grouping connected components using some spatial constraints. Experiments show that R-PHOC, which operates on images directly, can improve the current state of the art on the standard GW dataset and, in some cases, performs as well as PHOCNET, which was designed for segmentation-based word spotting.
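The retrieval step described in the abstract can be sketched as follows: given one query embedding and one embedding per candidate bounding box, rank the boxes by similarity to the query. This is a minimal illustration, not the authors' implementation. The function name `rank_candidates`, the `top_k` parameter, and the choice of cosine similarity are assumptions; the abstract only says "nearest neighbour search" and does not name the distance.

```python
import numpy as np


def rank_candidates(query_vec, candidate_vecs, top_k=5):
    """Rank candidate boxes by cosine similarity to the query (illustrative sketch).

    query_vec:      shape (d,), embedding of the query word
    candidate_vecs: shape (n, d), one embedding per candidate bounding box
    Returns the indices of the top_k candidates and their similarity scores.
    """
    eps = 1e-12  # guards against division by zero for all-zero vectors
    q = query_vec / (np.linalg.norm(query_vec) + eps)
    c = candidate_vecs / (np.linalg.norm(candidate_vecs, axis=1, keepdims=True) + eps)
    scores = c @ q                       # cosine similarity per candidate
    order = np.argsort(-scores)[:top_k]  # highest similarity first
    return order, scores[order]
```

In use, the returned indices would point back into the list of candidate boxes, so the top-ranked boxes are the retrieved word locations on the page.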
