FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models

Yi Sun; Zhiqi Zhang; Xinhao Zhong; Yimin Zhou; Shuoyang Sun; Bin Chen; Shu-Tao Xia; Ke Xu

arXiv:2605.19739·cs.CV·May 20, 2026

TL;DR

FlowErase-RL introduces a reward-based framework for concept erasure in flow matching models, enhancing safety and control in text-to-image generation without requiring supervised data.

Contribution

The paper reformulates concept erasure as a reward optimization problem using a dual-path reward mechanism, enabling scalable and effective multi-concept erasure without explicit supervision.

Findings

1. Achieves state-of-the-art erasure performance on various concepts.
2. Maintains high image quality and semantic alignment after erasure.
3. Demonstrates robustness against adversarial attacks and scalability to multiple concepts.

Abstract

Recent advances in flow matching models have significantly improved text-to-image generation quality, but they also introduce growing safety risks due to the generation of harmful or undesirable content. Existing concept erasure methods are either inference-time interventions with limited effectiveness or rely on supervised fine-tuning (SFT), which requires precisely aligned data and struggles with scalability and multi-concept settings. In this paper, we propose FlowErase-RL, the first GRPO-based framework for concept erasure in flow matching models. We reformulate concept erasure as a reward optimization problem and introduce a dynamic dual-path reward mechanism that jointly optimizes (i) a Concept Erasure (CE) reward to suppress target concepts and (ii) a Non-target Space (NS) reward to preserve generative fidelity. The two reward paths are adaptively balanced during…
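
To make the dual-path idea concrete, the following is a minimal sketch of how two per-sample rewards could be blended and then turned into GRPO-style group-relative advantages. It is not the authors' implementation: the abstract is truncated before it describes how the two paths are balanced, so the fixed `weight` below is a stand-in assumption, and the function names `combined_reward` and `group_relative_advantages` as well as the toy scores are hypothetical. The only part taken from GRPO in general is that each reward is standardized against the other samples in its group (population standard deviation is used here; implementations vary).

```python
from statistics import mean, pstdev


def combined_reward(ce: float, ns: float, weight: float) -> float:
    """Blend the CE and NS rewards.

    `weight` in [0, 1] is a hypothetical fixed balance knob; the paper
    describes an adaptive balance whose rule is not given in the abstract.
    """
    return weight * ce + (1.0 - weight) * ns


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantages: standardize each reward within its sample group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps guards against a zero-variance group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy scores for a group of four images sampled from the same prompt.
ce_scores = [0.9, 0.2, 0.6, 0.4]  # higher = target concept better suppressed
ns_scores = [0.5, 0.8, 0.7, 0.6]  # higher = non-target content better preserved

rewards = [combined_reward(c, n, weight=0.5) for c, n in zip(ce_scores, ns_scores)]
advantages = group_relative_advantages(rewards)
```

Under these assumptions, samples that score above the group mean on the blended reward receive positive advantages and are reinforced, while the rest are pushed down. No learned value function is needed, which is the usual motivation for group-relative normalization.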
