DocVCE: Diffusion-based Visual Counterfactual Explanations for Document Image Classification

Saifullah Saifullah; Stefan Agne; Andreas Dengel; Sheraz Ahmed

arXiv:2508.04233·cs.CV·August 7, 2025

TL;DR

This paper introduces DocVCE, a diffusion-based method that generates visual counterfactual explanations for document image classification models, making their decisions more transparent.

Contribution

The paper presents the first generative counterfactual explanation approach for document image analysis, built on diffusion models with hierarchical refinement.

Findings

01 Effective in generating plausible counterfactuals across multiple datasets and models

02 Outperforms existing feature-importance methods in interpretability

03 Provides insights into global features learned by classifiers

Abstract

As black-box AI-driven decision-making systems become increasingly widespread in modern document processing workflows, improving their transparency and reliability has become critical, especially in high-stakes applications where biases or spurious correlations in decision-making could lead to serious consequences. One vital component often found in such document processing workflows is document image classification, which, despite its widespread use, remains difficult to explain. While some recent works have attempted to explain the decisions of document image classification models through feature-importance maps, these maps are often difficult to interpret and fail to provide insights into the global features learned by the model. In this paper, we aim to bridge this research gap by introducing generative document counterfactuals that provide meaningful insights into the model's…
