Taming Generative Synthetic Data for X-ray Prohibited Item Detection

Jialong Sun; Hongguang Zhu; Weizhe Liu; Yunda Sun; Renshuai Tao; Yunchao Wei

arXiv:2511.15299 · cs.CV · November 20, 2025

TL;DR

This paper introduces Xsyn, a one-stage text-to-image pipeline that synthesizes X-ray security images for training prohibited item detectors. It removes the labor-intensive foreground extraction step of earlier two-stage methods and outperforms them by 1.2% mAP.

Contribution

Xsyn is the first one-stage synthesis pipeline for X-ray security images that requires no extra labor. It improves the usability of the synthetic images through Cross-Attention Refinement (CAR) and background occlusion modeling.

Findings

01. Xsyn achieves 1.2% higher mAP than previous methods.

02. Synthetic images improve detection performance across datasets.

03. The method reduces labor costs in data augmentation.

Abstract

Training prohibited item detection models requires a large amount of X-ray security images, but collecting and annotating these images is time-consuming and laborious. To address data insufficiency, X-ray security image synthesis methods composite images to scale up datasets. However, previous methods primarily follow a two-stage pipeline, where they implement labor-intensive foreground extraction in the first stage and then composite images in the second stage. Such a pipeline introduces inevitable extra labor cost and is not efficient. In this paper, we propose a one-stage X-ray security image synthesis pipeline (Xsyn) based on text-to-image generation, which incorporates two effective strategies to improve the usability of synthetic images. The Cross-Attention Refinement (CAR) strategy leverages the cross-attention map from the diffusion model to refine the bounding box annotation.…
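
The abstract states that CAR uses the diffusion model's cross-attention map to refine the bounding box annotation, but the excerpt above does not describe the procedure. The sketch below illustrates one plausible reading only: min-max normalize a per-token attention map, keep the cells above a threshold, and take the tight box around them. The function name `refine_box_from_attention`, the default threshold of 0.5, and the toy attention map are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) as inclusive cell indices of the attention map
Box = Tuple[int, int, int, int]


def refine_box_from_attention(
    attn: List[List[float]],
    threshold: float = 0.5,  # assumed value, not from the paper
) -> Optional[Box]:
    """Tight box around cells whose min-max normalized attention >= threshold."""
    flat = [v for row in attn for v in row]
    if not flat:
        return None
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return None  # flat map: no localized response to refine against

    xs, ys = [], []
    for y, row in enumerate(attn):
        for x, v in enumerate(row):
            if (v - lo) / (hi - lo) >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # threshold above 1.0 selects nothing
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x4 attention map for a single object token (hypothetical values).
attn_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.8, 0.9, 0.1],
    [0.0, 0.7, 1.0, 0.0],
    [0.1, 0.0, 0.2, 0.0],
]
refined = refine_box_from_attention(attn_map)
print(refined)  # (1, 1, 2, 2)
```

In practice the attention map is much lower resolution than the generated image, so a box found this way would need to be scaled up to image coordinates before it is used as a detection label.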

Code & Models

Models

Pillow-1/Xsyn (Hugging Face model)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Multimodal Machine Learning Applications