Genetic Programming for Document Segmentation and Region Classification Using Discipulus

N. Priyadharshini; M.S. Vijaya

arXiv:1303.0460 · cs.CV · March 5, 2013 · 1 cites

Open Access

TL;DR

This paper presents a genetic programming-based method for automatic document segmentation and classification into regions like text, images, and tables, achieving high accuracy and reducing manual effort in data extraction.

Contribution

It introduces a novel approach using Discipulus for genetic programming to classify document regions with 97.5% accuracy, improving automation in document analysis.

Findings

01. Achieved 97.5% classification accuracy.

02. Used the run length smearing rule for segmentation.

03. Demonstrated the effectiveness of genetic programming in document classification.

Abstract

Document segmentation is the process of dividing a document into distinct regions. A document is a collection of information and a standard means of conveying it to others. Extracting data from documents takes considerable human effort and time, which can severely limit the use of data systems. Automatic information extraction from documents has therefore become an important problem, and document segmentation has been shown to help overcome it. This paper proposes a new approach to segmenting a document and classifying its regions as text, image, drawing, or table. The document image is divided into blocks using the run length smearing rule, and features are extracted from every block. The Discipulus tool is used to construct the genetic programming based classifier model, which achieved 97.5% classification accuracy.

Taxonomy

Topics: Handwritten Text Recognition Techniques · Smart Agriculture and AI · Image Retrieval and Classification Techniques