Adaptive Multi Scale Document Binarisation Using Vision Mamba
Mohd. Azfar, Siddhant Bharadwaj, and Ashwin Sasikumar

TL;DR
This paper introduces a Mamba-based architecture for document binarisation that scales linearly with sequence length and adds Difference of Gaussians (DoG) features to its skip connections to preserve fine detail, improving the readability of historical documents.
Contribution
The work presents a novel Mamba-based model that integrates multiscale Difference of Gaussians (DoG) features for efficient, high-quality document binarisation, addressing the quadratic time complexity that often affects existing hybrid convolutional-transformer models.
Findings
Linear scaling handles long sequences efficiently
Incorporation of DoG features improves detail preservation
Achieves high-quality binarisation results
Abstract
Enhancing and preserving the readability of document images, particularly historical ones, is crucial for effective document image analysis. Numerous models have been proposed for this task, including convolutional-based, transformer-based, and hybrid convolutional-transformer architectures. While hybrid models address the limitations of purely convolutional or transformer-based methods, they often suffer from issues like quadratic time complexity. In this work, we propose a Mamba-based architecture for document binarisation, which efficiently handles long sequences by scaling linearly and optimizing memory usage. Additionally, we introduce novel modifications to the skip connections by incorporating Difference of Gaussians (DoG) features, inspired by conventional signal processing techniques. These multiscale high-frequency features enable the model to produce high-quality, detailed…
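The multiscale Difference of Gaussians (DoG) features mentioned in the abstract can be illustrated with a minimal sketch. It is a generic band-pass decomposition, not the authors' implementation: the function names (`gaussian_kernel_1d`, `gaussian_blur`, `dog_bands`), the choice of sigmas, and the edge padding are all assumptions made for illustration, and the excerpt does not say how the paper injects these features into its skip connections.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float) -> np.ndarray:
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur of a 2-D greyscale image (same output shape)."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    # Edge padding is an illustrative choice; it keeps flat regions flat.
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    # Convolve every row, then every column, with the same 1-D kernel.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def dog_bands(image: np.ndarray, sigmas=(1.0, 2.0, 4.0)) -> list:
    """One DoG band per adjacent sigma pair: blur(s_i) - blur(s_{i+1}).

    The sigma values are hypothetical defaults, not taken from the paper.
    """
    blurred = [gaussian_blur(image, s) for s in sigmas]
    return [fine - coarse for fine, coarse in zip(blurred[:-1], blurred[1:])]


# Toy "document": dark left half, bright right half (a vertical stroke edge).
page = np.zeros((32, 32))
page[:, 16:] = 1.0
bands = dog_bands(page)
```

Each band responds strongly near the stroke edge and is close to zero in flat regions, which is the sense in which DoG features carry high-frequency detail at several scales.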
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction
