Page Stream Segmentation with Convolutional Neural Nets Combining Textual and Visual Features

Gregor Wiedemann; Gerhard Heyer

arXiv:1710.03006·cs.CL·March 26, 2019·2 cites

Open Access

TL;DR

This paper presents a novel CNN-based method combining visual and textual features for page stream segmentation, achieving state-of-the-art accuracy in separating scanned document streams into individual documents.

Contribution

Introduces a new CNN architecture that integrates image and text features for improved page stream segmentation accuracy.

Findings

01 Achieves up to 93% accuracy in page stream segmentation.

02 Outperforms previous methods, setting a new state-of-the-art.

03 Effective combination of visual and textual features enhances segmentation results.

Abstract

In recent years, (retro-)digitizing paper-based files has become a major undertaking for private and public archives as well as an important task in electronic mailroom applications. As a first step, the workflow involves scanning and Optical Character Recognition (OCR) of documents. Preserving the document context of single-page scans is a major requirement here. To facilitate workflows involving very large amounts of paper scans, page stream segmentation (PSS) is the task of automatically separating a stream of scanned images into multi-page documents. In a digitization project together with a German federal archive, we developed a novel approach based on convolutional neural networks (CNN) combining image and text features to achieve optimal document separation results. Evaluation shows that our PSS architecture achieves an accuracy of up to 93%, which can be regarded as a new…
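
The segmentation task described in the abstract can be illustrated with a minimal sketch. It assumes that a classifier has already labeled each page of the stream as either the first page of a new document or a continuation of the previous one; that binary framing, the function names `segment_page_stream` and `page_accuracy`, and the per-page accuracy measure are illustrative assumptions, not details confirmed by this summary. The CNN itself is not reproduced here.

```python
from typing import List, Sequence


def segment_page_stream(is_first_page: Sequence[bool]) -> List[List[int]]:
    """Group page indices into documents.

    `is_first_page[i]` is an assumed per-page prediction: True if page i
    starts a new document, False if it continues the previous one.
    """
    documents: List[List[int]] = []
    for page_index, starts_new in enumerate(is_first_page):
        # The very first page always opens a document, whatever its label.
        if starts_new or not documents:
            documents.append([page_index])
        else:
            documents[-1].append(page_index)
    return documents


def page_accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of pages whose predicted label matches the true label.

    Assumption: accuracy is counted per page; the summary does not say
    how the reported figure was computed.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("cannot compute accuracy of an empty stream")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(predicted)
```

For example, the hypothetical label sequence `[True, False, False, True, True, False]` yields three documents covering pages 0 to 2, page 3, and pages 4 to 5. A single wrong label either merges two documents or splits one, which is why page-level accuracy matters for preserving document context.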

Taxonomy

Topics: Digital Media Forensic Detection · Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques