Automatic Removal of Marginal Annotations in Printed Text Document
Abdessamad Elboushaki, Rachida Hannane, P. Nagabhushan, Mohammed Javed

TL;DR
This paper presents a two-stage algorithm that automatically removes handwritten marginal annotations from printed documents and recovers the original printed text (97.74% retrieval accuracy reported) without requiring access to the original document.
Contribution
A novel two-stage method combining marginal boundary detection and connected component analysis to remove annotations and restore original printed text.
Findings
89.01% accuracy in removing marginal annotations
97.74% accuracy in retrieving original printed text
Validated on 50 complex annotated documents
Abstract
Recovering the original printed text from a document with handwritten annotations added in the marginal area is a challenging problem, especially when the original document is not available. This paper therefore aims to automatically salvage the original document from the annotated one by detecting and removing any handwritten annotations that appear in the marginal area, without any loss of information. A two-stage algorithm is proposed. In the first stage, the marginal boundary is detected approximately using horizontal and vertical projection profiles, so all of the marginal annotations are removed, along with some of the original printed text that lies very close to the marginal boundary. In the second stage, a strategy based on connected components is therefore applied to bring back the printed text components cropped during the…
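The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`body_bounds`, `label_components`, `remove_marginal_annotations`), the density threshold `frac`, and the restoration rule `min_inside` are all assumptions made for illustration, and the input is assumed to be a binarized page where nonzero pixels are ink.

```python
from collections import deque

import numpy as np


def body_bounds(profile, frac=0.5):
    """Return (first, last) index where the profile reaches frac * its maximum.

    Assumption: the printed text body is much denser than the margins,
    so a simple relative threshold separates the two.
    """
    dense = np.flatnonzero(profile >= frac * profile.max())
    return int(dense[0]), int(dense[-1])


def label_components(img):
    """Label 8-connected components of a binary image with breadth-first search."""
    h, w = img.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for r, c in zip(*np.nonzero(img)):
        if labels[r, c]:
            continue  # pixel already belongs to a labelled component
        count += 1
        labels[r, c] = count
        queue = deque([(r, c)])
        while queue:
            y, x = queue.popleft()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and img[ny, nx] and not labels[ny, nx]):
                        labels[ny, nx] = count
                        queue.append((ny, nx))
    return labels, count


def remove_marginal_annotations(img, frac=0.5, min_inside=0.5):
    """Hypothetical two-stage cleanup of a binarized page (nonzero = ink)."""
    img = (np.asarray(img) > 0).astype(np.uint8)

    # Stage 1: approximate the text-body boundary from the horizontal
    # (row-sum) and vertical (column-sum) projection profiles, then
    # discard every pixel outside that rectangle.
    top, bottom = body_bounds(img.sum(axis=1), frac)
    left, right = body_bounds(img.sum(axis=0), frac)
    inside = np.zeros(img.shape, dtype=bool)
    inside[top:bottom + 1, left:right + 1] = True
    result = img * inside

    # Stage 2: restore whole connected components that lie mostly inside
    # the rectangle, so printed strokes clipped in stage 1 come back.
    # The "mostly inside" rule is an assumption, not the paper's criterion.
    labels, count = label_components(img)
    for k in range(1, count + 1):
        comp = labels == k
        if (comp & inside).sum() >= min_inside * comp.sum():
            result[comp] = 1
    return result
```

In this sketch, a marginal scribble that never touches the text body stays removed, while a printed stroke that pokes slightly past the detected boundary is restored with its component. An annotation that physically touches printed text would be merged into the same component and restored too, which is one reason the real method needs a more careful strategy. If SciPy is available, `scipy.ndimage.label` with a 3x3 structuring element of ones could replace the hand-written labelling.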
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques · Vehicle License Plate Recognition
