Generating Risky Samples with Conformity Constraints via Diffusion Models

Han Yu; Hao Zou; Xingxuan Zhang; Zhengyi Wang; Yue He; Kehan Li; Peng Cui

arXiv:2512.18722 · cs.LG · December 23, 2025

TL;DR

RiskyDiff is a diffusion-based method that generates risky samples with high category conformity by using embedding constraints and scoring, improving diversity and application safety.

Contribution

The paper introduces RiskyDiff, a novel diffusion model that incorporates embedding constraints and conformity scoring to generate high-risk, category-conforming samples beyond dataset coverage.

Findings

01. RiskyDiff outperforms existing methods in risk level and quality.

02. It enhances model generalization by augmenting training data with conforming risky samples.

03. The approach effectively balances risk generation and category conformity.

Abstract

Although neural networks achieve promising performance on many tasks, they can still fail on some examples and introduce risks to applications. To discover risky samples, previous work searches for patterns of risky samples within existing datasets or injects perturbations into them. This limits the diversity of risky samples to the coverage of existing datasets. To overcome this limitation, recent works adopt diffusion models to produce new risky samples beyond that coverage. However, these methods struggle to keep generated samples consistent with their expected categories, which can introduce label noise and severely limit their effectiveness in applications. To address this issue, we propose RiskyDiff, which incorporates the embeddings of both texts and images as implicit constraints on category conformity. We also…
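
The idea of combining risk with category conformity can be illustrated with a small sketch. This is not the paper's implementation: the abstract only states that text and image embeddings act as implicit constraints, and it does not say how they are applied during generation. The post hoc filter below, the function names (`cosine`, `conformity_score`, `select_risky_conforming`), the equal weighting of the two embeddings, and the `min_conformity` threshold are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def conformity_score(sample_emb, text_emb, image_emb, text_weight=0.5):
    """Hypothetical score: weighted mean of similarity to a category's
    text embedding and to a reference image embedding of that category."""
    text_sim = cosine(sample_emb, text_emb)
    image_sim = cosine(sample_emb, image_emb)
    return text_weight * text_sim + (1.0 - text_weight) * image_sim


def select_risky_conforming(candidates, text_emb, image_emb,
                            target_label, min_conformity=0.8):
    """Keep candidates that the classifier gets wrong (risky) but that
    still score as members of the target category (conforming).

    candidates: list of (sample_embedding, predicted_label) pairs.
    """
    selected = []
    for sample_emb, predicted_label in candidates:
        is_risky = predicted_label != target_label
        score = conformity_score(sample_emb, text_emb, image_emb)
        if is_risky and score >= min_conformity:
            selected.append((sample_emb, score))
    return selected
```

A sample that the classifier mislabels but that sits far from the category embeddings is dropped here, since keeping it would add the kind of label noise the abstract warns about.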

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis