# The Enhance-Fuse-Align Principle: A New Architectural Blueprint for Robust Object Detection, with Application to X-Ray Security

**Authors:** Yuduo Lin, Yanfeng Lin, Heng Wu, Ming Wu

PMC · DOI: 10.3390/s25216603 · 2025-10-27

## TL;DR

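This paper introduces SecureDet, an object detection architecture for X-ray security imaging that enhances features first, then fuses them across scales, and finally aligns them. The authors report that this ordering outperforms the baseline and variants with missing or reordered stages.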

## Contribution

The paper introduces the Enhance-Fuse-Align (E-F-A) principle, an architectural blueprint for robust object detection in noisy and ambiguous imaging domains.

## Key findings

- SecureDet, which implements the E-F-A principle, outperforms the baseline as well as incomplete and improperly ordered architectures in X-ray contraband detection.
- Applying enhancement before fusion attenuates noise and blur that cross-scale aggregation would otherwise amplify.
- The final alignment stage corrects mis-registrations, so the detector avoids sampling extraneous signals from occluding materials.
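
The ordering claim in these findings can be made concrete with a toy sketch. Everything in it is an assumption for illustration: the 1-D "feature maps", the `NOISE_THRESHOLD` value, and the placeholder `enhance`, `fuse`, and `align` functions are hypothetical stand-ins, not the paper's RFCBAMConv, BiFPN, or ECFA/ASFA modules. The stages are chosen so that they do not commute, which shows only that stage order can change the output. It does not reproduce any result from the paper.

```python
from typing import Callable, Dict, List

FeatureMap = List[float]   # toy 1-D stand-in for a feature map at one scale
Pyramid = List[FeatureMap] # one feature map per scale, all the same length

NOISE_THRESHOLD = 6.0      # hypothetical cutoff; not a value from the paper


def enhance(pyramid: Pyramid) -> Pyramid:
    """Placeholder enhancement: zero out values above the cutoff as 'noise'."""
    return [[0.0 if abs(v) > NOISE_THRESHOLD else v for v in fmap]
            for fmap in pyramid]


def fuse(pyramid: Pyramid) -> Pyramid:
    """Placeholder fusion: replace every scale with the mean across scales."""
    n = len(pyramid)
    mean = [sum(vals) / n for vals in zip(*pyramid)]
    return [list(mean) for _ in pyramid]


def align(pyramid: Pyramid) -> Pyramid:
    """Placeholder alignment: identity, since the toy maps are already registered."""
    return [list(fmap) for fmap in pyramid]


STAGES: Dict[str, Callable[[Pyramid], Pyramid]] = {
    "E": enhance,
    "F": fuse,
    "A": align,
}


def run_pipeline(pyramid: Pyramid, order: str = "EFA") -> Pyramid:
    """Apply the stages named in `order`, left to right."""
    for key in order:
        if key not in STAGES:
            raise ValueError(f"unknown stage {key!r}")
        pyramid = STAGES[key](pyramid)
    return pyramid


if __name__ == "__main__":
    # One scale carries a single noise spike (9.0); the other is clean.
    toy = [[0.0, 0.0, 9.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
    print("E-F-A:", run_pipeline(toy, "EFA"))
    print("F-E-A:", run_pipeline(toy, "FEA"))
```

In this toy setup, enhancing first removes the spike before it can be averaged into the other scale. Fusing first dilutes the spike to 5.0, which falls below the cutoff, so it survives in every scale. This happens by construction of the placeholder stages and is meant only to illustrate the kind of order sensitivity an ablation over stage orderings would probe.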

## Abstract

Object detection in challenging imaging domains like security screening, medical analysis, and satellite imaging is often hindered by signal degradation (e.g., noise, blur) and spatial ambiguity (e.g., occlusion, extreme scale variation). We argue that many standard architectures fail by fusing multi-scale features prematurely, which amplifies noise. This paper introduces the Enhance-Fuse-Align (E-F-A) principle: a new architectural blueprint positing that robust feature enhancement and explicit spatial alignment are necessary preconditions for effective feature fusion. We implement this blueprint in a model named SecureDet, which instantiates each stage: (1) an RFCBAMConv module for feature Enhancement; (2) a BiFPN for weighted Fusion; (3) ECFA and ASFA modules for contextual and spatial Alignment. To validate the E-F-A blueprint, we apply SecureDet to the highly challenging task of X-ray contraband detection. Extensive experiments and ablation studies demonstrate that the mandated E-F-A sequence is critical to performance, significantly outperforming both the baseline and incomplete or improperly ordered architectures. In practice, enhancement is applied prior to fusion to attenuate noise and blur that would otherwise be amplified by cross-scale aggregation, and final alignment corrects mis-registrations to avoid sampling extraneous signals from occluding materials.
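
The abstract names a BiFPN for weighted Fusion but does not give its formula. As background, BiFPN as originally described for EfficientDet combines its inputs with "fast normalized fusion": each input gets a learned weight that is clamped to be non-negative, and the weighted sum is divided by the sum of the weights plus a small epsilon. The sketch below shows that rule on plain Python lists. Whether SecureDet uses exactly this variant is an assumption, and the function name, the flat-list feature representation, and the default `eps` are illustrative choices.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-sized feature vectors with non-negative, sum-normalized weights.

    out[k] = sum_i(w_i * features[i][k]) / (eps + sum_j(w_j)), with w_i = max(0, weights[i])
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    if not features:
        raise ValueError("need at least one input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all inputs must already share the same size")

    # Clamp weights at zero (the ReLU step) so no input is subtracted.
    w = [max(0.0, float(x)) for x in weights]
    denom = eps + sum(w)

    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / denom
        for k in range(length)
    ]


if __name__ == "__main__":
    p_low = [1.0, 1.0, 1.0]    # hypothetical feature from one scale
    p_high = [3.0, 3.0, 3.0]   # hypothetical feature from another scale, already resized
    print(fast_normalized_fusion([p_low, p_high], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs, and a negative raw weight simply drops that input from the sum. In a real network the weights would be learned parameters and the inputs would be resized tensors rather than lists.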

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12610471/full.md

---
Source: https://tomesphere.com/paper/PMC12610471