PDNet: Semantic Segmentation integrated with a Primal-Dual Network for Document binarization

Kalyan Ram Ayyalasomayajula; Filip Malmberg; Anders Brun

arXiv:1801.08694·stat.ML·May 18, 2018

TL;DR

This paper introduces PDNet, a deep neural network that combines a fully convolutional semantic segmentation network with an unrolled primal-dual network to binarize documents, particularly degraded historical ones. It reports state-of-the-art results on four out of seven datasets.

Contribution

The novel integration of a fully convolutional network with an unrolled primal-dual network for end-to-end training in document binarization.

Findings

1. Achieves state-of-the-art binarization on four out of seven datasets
2. Pre-training on synthetic data improves performance
3. Handles numerical instabilities in primal-dual training

Abstract

Binarization of digital documents is the task of classifying each pixel in an image of the document as belonging to the background (parchment/paper) or foreground (text/ink). Historical documents are often subject to degradations that make the task challenging. In the current work, a deep neural network architecture is proposed that combines a fully convolutional network with an unrolled primal-dual network and can be trained end-to-end to achieve state-of-the-art binarization on four out of seven datasets. Document binarization is formulated as an energy minimization problem. A fully convolutional neural network is trained for semantic segmentation of pixels and provides the labeling cost associated with each pixel. This cost estimate is refined along the edges using a primal-dual approach, to compensate for any over- or underestimation of the foreground class. We provide necessary…
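The energy-minimization view in the abstract can be illustrated with a small, generic sketch. This is not the authors' unrolled, trainable network. It is a plain Chambolle-Pock primal-dual iteration on a relaxed labeling energy, written here under several assumptions: the per-pixel cost is taken to be the log-odds of a foreground probability map (standing in for the output of the segmentation network), the edge term is an unweighted total-variation penalty, and the names `primal_dual_binarize`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def grad(u):
    # Forward differences with zero gradient at the far border.
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    # Discrete divergence, the negative adjoint of grad above.
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[:, 0] = px[:, 0]
    dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    dx[:, -1] = -px[:, -2]
    dy[0, :] = py[0, :]
    dy[1:-1, :] = py[1:-1, :] - py[:-2, :]
    dy[-1, :] = -py[-2, :]
    return dx + dy

def primal_dual_binarize(prob_fg, lam=0.5, n_iter=300):
    """Minimize sum(cost * u) + lam * TV(u) over u in [0, 1], then threshold.

    Assumption: cost is the log-odds of background vs. foreground, so
    pixels with prob_fg > 0.5 have negative cost (they prefer u = 1).
    """
    p = np.clip(prob_fg.astype(np.float64), 1e-6, 1 - 1e-6)
    cost = np.log((1 - p) / p)
    tau = sigma = 1.0 / np.sqrt(8.0)  # tau * sigma * ||grad||^2 <= 1 in 2D
    u = p.copy()
    u_bar = u.copy()
    px = np.zeros_like(u)
    py = np.zeros_like(u)
    for _ in range(n_iter):
        # Dual ascent step, then projection onto the ball of radius lam.
        gx, gy = grad(u_bar)
        px += sigma * gx
        py += sigma * gy
        scale = np.maximum(1.0, np.sqrt(px ** 2 + py ** 2) / lam)
        px /= scale
        py /= scale
        # Primal descent step, then projection onto [0, 1].
        u_old = u
        u = np.clip(u + tau * (div(px, py) - cost), 0.0, 1.0)
        # Over-relaxation of the primal variable.
        u_bar = 2 * u - u_old
    return u > 0.5

# Toy input: a confident 8x8 "ink" square plus one weakly positive noise pixel.
prob = np.full((20, 20), 0.1)
prob[6:14, 6:14] = 0.9
prob[2, 2] = 0.6
labels = primal_dual_binarize(prob)
```

On this toy input, thresholding `prob` at 0.5 keeps the isolated noise pixel, while the smoothness term in the refined labeling suppresses it and leaves the confident square intact. In the paper's setting the primal-dual updates are unrolled as network layers and trained jointly with the segmentation network; none of that is modeled here.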

Code & Models

Repositories

krayyalasomayajula/pdNet (Official)
