NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement
Giordano Cicchetti, Danilo Comminiello

TL;DR
NAF-DPM introduces a fast, nonlinear activation-free diffusion model for document enhancement, significantly improving OCR accuracy by effectively restoring degraded documents while reducing inference time.
Contribution
The paper presents a novel diffusion probabilistic model with an activation-free network and a fast ODE solver, enhancing document restoration efficiency and OCR performance.
Findings
Achieves state-of-the-art pixel and perceptual similarity metrics.
Reduces character error rate in OCR transcriptions.
Demonstrates superior performance across multiple datasets.
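The character error rate (CER) cited in the findings is conventionally defined as the edit distance between the OCR output and the ground-truth text, divided by the length of the ground truth. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function names `levenshtein` and `character_error_rate` are illustrative assumptions.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn `ref` into `hyp` (standard dynamic programming)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into the empty string takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


# A classic OCR confusion: "m" read as "rn" (one substitution, one insertion).
cer = character_error_rate("document", "docurnent")
```

A lower CER on the enhanced image than on the degraded one would indicate that restoration helped the OCR system.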
Abstract
Real-world documents may suffer various forms of degradation, often resulting in lower accuracy in optical character recognition (OCR) systems. A preprocessing step is therefore essential to eliminate noise while preserving text and key features of documents. In this paper, we propose NAF-DPM, a novel generative framework based on a diffusion probabilistic model (DPM) designed to restore the original quality of degraded documents. While DPMs are recognized for their high-quality generated images, they are also known for their long inference time. To mitigate this problem, we provide the DPM with an efficient nonlinear activation-free (NAF) network and employ as a sampler a fast solver of ordinary differential equations, which can converge in a few iterations. To better preserve text characters, we introduce an additional differentiable module based on convolutional recurrent…
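To illustrate why an ODE-based sampler can finish in a few iterations, here is a minimal sketch of a deterministic, first-order update of the DDIM type. It is a stand-in chosen for brevity and is not claimed to be the solver used in the paper. The names `few_step_sample`, `make_oracle`, and the `alpha_bars` schedule are assumptions, and the "oracle" noise predictor replaces a trained network so the example can run on its own.

```python
import math


def few_step_sample(x_T, eps_model, alpha_bars):
    """Deterministic sampling over a short schedule.

    alpha_bars: cumulative signal levels ordered from the noisiest step
    to the clean end, e.g. [0.01, 0.3, 0.7, 1.0] (hypothetical values).
    eps_model(x, alpha_bar) must return the predicted noise for x.
    """
    x = list(x_T)
    for ab_t, ab_prev in zip(alpha_bars[:-1], alpha_bars[1:]):
        eps = eps_model(x, ab_t)
        # Estimate the clean signal implied by the noise prediction.
        x0_pred = [(xi - math.sqrt(1 - ab_t) * ei) / math.sqrt(ab_t)
                   for xi, ei in zip(x, eps)]
        # Move to the next (less noisy) level without adding fresh noise.
        x = [math.sqrt(ab_prev) * x0i + math.sqrt(1 - ab_prev) * ei
             for x0i, ei in zip(x0_pred, eps)]
    return x


def make_oracle(x0):
    """Hypothetical perfect noise predictor that knows the clean signal."""
    def eps_model(x, ab):
        return [(xi - math.sqrt(ab) * x0i) / math.sqrt(1 - ab)
                for xi, x0i in zip(x, x0)]
    return eps_model


# Toy "clean document" of three pixel values and a fixed noise draw.
x0 = [0.2, -0.5, 1.0]
noise = [0.3, -1.2, 0.7]
schedule = [0.01, 0.3, 0.7, 1.0]  # only three solver steps
x_T = [math.sqrt(schedule[0]) * a + math.sqrt(1 - schedule[0]) * n
       for a, n in zip(x0, noise)]
restored = few_step_sample(x_T, make_oracle(x0), schedule)
```

With a perfect noise predictor the three-step schedule recovers the clean signal; with a learned network, the number of steps trades inference time against restoration quality.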
Taxonomy
Topics: Handwritten Text Recognition Techniques · Music and Audio Processing · Text and Document Classification Technologies
Methods: Diffusion
