Document image classification, with a specific view on applications of patent images
Gabriela Csurka

TL;DR
This paper conducts an extensive experimental study on document image classification and retrieval, focusing on parameter optimization for RunLength Histogram (RL) and Fisher Vector (FV) representations across diverse datasets, including patent images, to guide effective feature selection.
Contribution
It provides comprehensive guidelines for parameter choices in RL and FV image representations, enhancing their applicability to various document image classification and retrieval tasks, especially patents.
Findings
Optimal parameters vary across datasets and tasks
RL and FV representations perform well with proper tuning
The same features can be used for classification and retrieval
Abstract
The main focus of this paper is document image classification and retrieval, where we analyze and compare different parameters for the RunLength Histogram (RL) and Fisher Vector (FV) based image representations. We conduct an exhaustive experimental study using different document image datasets, including the MARG benchmarks, two datasets built on customer data, and the images from the Patent Image Classification task of CLEF-IP 2011. The aim of the study is to give guidelines on how best to choose the parameters so that the same features perform well on different tasks. As an example of such a need, we describe the Image-based Patent Retrieval task of CLEF-IP 2011, where we used the same image representation to predict the image type and retrieve relevant patents.
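To make the RL representation more concrete, here is a minimal sketch of one common form of run-length histogram. It is an illustration only, not the paper's implementation: the abstract does not give the feature definition, so the choice of horizontal foreground runs, the log2-spaced bins, the default `n_bins=8`, and the function names `run_lengths` and `runlength_histogram` are all assumptions.

```python
import numpy as np

def run_lengths(row):
    """Lengths of consecutive runs of 1s in a 1-D binary array."""
    # Pad with zeros so runs touching the borders are closed properly.
    padded = np.concatenate(([0], row, [0]))
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)   # 0 -> 1 transitions
    ends = np.flatnonzero(diff == -1)    # 1 -> 0 transitions
    return ends - starts

def runlength_histogram(binary, n_bins=8):
    """Normalized histogram of horizontal foreground run lengths.

    Assumption: run lengths are quantized on a log2 scale, and the last
    bin collects every run of length >= 2 ** (n_bins - 1).
    """
    binary = np.asarray(binary).astype(np.int8)
    hist = np.zeros(n_bins)
    for row in binary:
        lengths = run_lengths(row)
        if lengths.size:
            bins = np.minimum(np.floor(np.log2(lengths)).astype(int), n_bins - 1)
            np.add.at(hist, bins, 1)  # accumulate counts, allowing repeated bins
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy 3x4 binary image: 1 marks a foreground (ink) pixel.
image = np.array([[1, 1, 0, 1],
                  [0, 0, 0, 0],
                  [1, 1, 1, 1]])
features = runlength_histogram(image, n_bins=4)
```

A fuller descriptor would typically also cover other scan directions and background runs, and might concatenate histograms over image sub-regions; those are the kinds of parameters a study like this one would vary.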
Taxonomy
Topics: Currency Recognition and Detection · Text and Document Classification Technologies · Image Retrieval and Classification Techniques
