# C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Car Damage Detection

**Authors:** Abdellah Zakaria Sellam, Ilyes Benaissa, Salah Eddine Bekhouche, Abdenour Hadid, Vito Renò, Cosimo Distante

arXiv: 2509.00578 · 2025-10-28

## TL;DR

C-DiffDet+ adds Context-Aware Fusion (CAF) to the DiffusionDet detector: cross-attention lets each local proposal feature attend to global scene context produced by a dedicated encoder, improving fine-grained vehicle damage detection on the CarDD benchmark.

## Contribution

The paper presents Context-Aware Fusion (CAF), a method that integrates global scene context into diffusion-based object detection, addressing the limits of purely local feature conditioning in context-dependent scenes.

## Key findings

- Reports an improvement over prior state-of-the-art models on the CarDD benchmark.
- Improves detection accuracy for fine-grained damage assessment.
- Shows that integrating global scene context benefits diffusion-based detectors.

## Abstract

Fine-grained object detection in challenging visual domains, such as vehicle damage assessment, is difficult even for human experts to perform reliably. While DiffusionDet has advanced the state of the art through conditional denoising diffusion, its performance remains limited by local feature conditioning in context-dependent scenarios. We address this fundamental limitation by introducing Context-Aware Fusion (CAF), which leverages cross-attention mechanisms to integrate global scene context with local proposal features directly. The global context is generated using a separate dedicated encoder that captures comprehensive environmental information, enabling each object proposal to attend to scene-level understanding. Our framework significantly enhances the generative detection paradigm by enabling each object proposal to attend to comprehensive environmental information. Experimental results demonstrate an improvement over state-of-the-art models on the CarDD benchmark, establishing new performance benchmarks for context-aware object detection in fine-grained domains.
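The fusion step described in the abstract, where each object proposal attends to global scene context, can be illustrated with a minimal single-head cross-attention sketch. This is not the authors' implementation: the function name `cross_attention_fuse`, the identity query/key/value projections, the residual addition, and the toy vectors are all assumptions made for illustration. A real model would use learned projections, multiple heads, and features from actual encoders.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention_fuse(proposals, context_tokens):
    """Hypothetical CAF-style fusion: proposals are queries, global
    context tokens serve as both keys and values (identity projections,
    an assumption). The attended context is added back residually."""
    d = len(proposals[0])
    scale = 1.0 / math.sqrt(d)  # standard scaled dot-product factor
    fused, all_weights = [], []
    for q in proposals:
        scores = [dot(q, k) * scale for k in context_tokens]
        weights = softmax(scores)
        # Weighted sum of context tokens, one value per feature dimension.
        ctx = [sum(w * v[j] for w, v in zip(weights, context_tokens))
               for j in range(d)]
        fused.append([qi + ci for qi, ci in zip(q, ctx)])
        all_weights.append(weights)
    return fused, all_weights

# Toy inputs: two proposal features and three global context tokens.
proposals = [[1.0, 0.0], [0.0, 1.0]]
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention_fuse(proposals, context)
```

In this sketch each proposal keeps its own local feature and receives a context vector weighted toward the scene tokens it is most similar to, which is the general idea behind letting proposals "attend to scene-level understanding".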

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00578/full.md

## Figures

51 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00578/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/2509.00578/full.md

---
Source: https://tomesphere.com/paper/2509.00578