FlowFixer: Towards Detail-Preserving Subject-Driven Generation

Jinyoung Jun; Won-Dong Jang; Wenbin Ouyang; Raghudeep Gadde; Jungbeom Lee

arXiv:2602.21402·cs.CV·March 2, 2026

TL;DR

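FlowFixer is a refinement framework for subject-driven generation (SDG) that restores fine details lost when a subject's scale or perspective changes. It works by direct image-to-image translation from visual references, and the paper also proposes a keypoint matching-based metric for measuring detail fidelity.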

Contribution

FlowFixer introduces direct image-to-image translation from visual references, a one-step denoising scheme that generates self-supervised training data, and a keypoint matching-based metric for assessing fidelity in fine details.

Findings

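01. Outperforms state-of-the-art SDG methods in both qualitative and quantitative evaluations.

02. Introduces a one-step denoising scheme that generates self-supervised training data.

03. Proposes a keypoint matching-based metric that assesses fidelity in fine details, beyond the semantic similarity measured by CLIP or DINO.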
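To make the idea of a keypoint matching score concrete, here is a minimal sketch. It is not the paper's metric: the listing does not give its definition, so the mutual nearest-neighbour rule, the ratio test, the normalisation by the number of reference keypoints, and the function names `mutual_nn_matches` and `keypoint_fidelity` are all assumptions made for illustration. The inputs are descriptor arrays that would come from any keypoint detector run on the reference image and the generated image.

```python
import numpy as np


def mutual_nn_matches(desc_ref, desc_gen, ratio=0.8):
    """Return (i, j) pairs that are mutual nearest neighbours and pass a ratio test.

    desc_ref: (n_ref, d) descriptors from the reference image.
    desc_gen: (n_gen, d) descriptors from the generated image.
    The matching rule is an illustrative assumption, not the paper's definition.
    """
    # Pairwise Euclidean distances, shape (n_ref, n_gen).
    dist = np.linalg.norm(desc_ref[:, None, :] - desc_gen[None, :, :], axis=2)
    nn_of_ref = dist.argmin(axis=1)  # best generated keypoint for each reference keypoint
    nn_of_gen = dist.argmin(axis=0)  # best reference keypoint for each generated keypoint

    matches = []
    for i, j in enumerate(nn_of_ref):
        if nn_of_gen[j] != i:
            continue  # not a mutual nearest neighbour
        row = np.sort(dist[i])
        if row.size > 1 and row[0] > ratio * row[1]:
            continue  # best match is not clearly better than the runner-up
        matches.append((i, int(j)))
    return matches


def keypoint_fidelity(desc_ref, desc_gen, ratio=0.8):
    """Fraction of reference keypoints with a confident match in the generated image."""
    if len(desc_ref) == 0 or len(desc_gen) == 0:
        return 0.0
    return len(mutual_nn_matches(desc_ref, desc_gen, ratio)) / len(desc_ref)
```

A score of this kind depends on local structure such as logos, text, and texture corners being reproduced, which is the sort of detail a global embedding similarity can miss.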

Abstract

We present FlowFixer, a refinement framework for subject-driven generation (SDG) that restores fine details lost during generation due to changes in the scale and perspective of a subject. FlowFixer performs direct image-to-image translation from visual references, avoiding the ambiguities of language prompts. To enable image-to-image training, we introduce a one-step denoising scheme to generate self-supervised training data, which automatically removes high-frequency details while preserving global structure, effectively simulating real-world SDG errors. We further propose a keypoint matching-based metric to properly assess fidelity in details, beyond the semantic similarities usually measured by CLIP or DINO. Experimental results demonstrate that FlowFixer outperforms state-of-the-art SDG methods in both qualitative and quantitative evaluations, setting a new benchmark for high-fidelity…
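The one-step denoising idea in the abstract can be illustrated with a toy sketch. Everything specific here is an assumption, since the abstract does not state the formulation: the code assumes a rectified-flow style interpolation x_t = (1 - t) * x0 + t * eps, and it replaces the trained generative model with a hypothetical stand-in, `toy_velocity`, built from a box blur. The names `box_blur`, `toy_velocity`, and `one_step_degrade` are illustrative only. The point is the shape of the procedure: noise a clean image, jump back to a clean estimate in a single step, and pair the detail-poor result with the original as (input, target) training data.

```python
import numpy as np


def box_blur(img, k=5):
    """Average each pixel over a k x k window (edge-padded); a crude low-pass filter."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def toy_velocity(x_t, t):
    """Hypothetical stand-in for a trained flow model (assumption, not the paper's network).

    It guesses the clean image as a blurred, rescaled x_t and returns the
    velocity implied by x_t = x0 + t * v.
    """
    x0_guess = box_blur(x_t) / (1.0 - t)
    return (x_t - x0_guess) / t


def one_step_degrade(x0, velocity_fn, t=0.3, seed=0):
    """Noise x0 to time t (0 < t < 1), then estimate the clean image in one Euler step."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(x0.shape)
    x_t = (1.0 - t) * x0 + t * eps        # assumed rectified-flow interpolation
    return x_t - t * velocity_fn(x_t, t)  # single-step clean estimate


# Usage: build one (degraded, clean) training pair from a synthetic image.
yy, xx = np.mgrid[0:32, 0:32]
coarse = xx / 31.0                            # global structure: horizontal gradient
fine = 0.2 * ((xx + yy) % 2 * 2 - 1)          # fine detail: checkerboard texture
clean = coarse + fine
degraded = one_step_degrade(clean, toy_velocity, t=0.3, seed=0)
```

In this toy setting the checkerboard texture is largely averaged away while the gradient survives, which mirrors the abstract's description of removing high-frequency details while preserving global structure.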

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning