Score Replacement with Bounded Deviation for Rare Prompt Generation
Bo-Kai Ruan, Zi-Xiang Ni, Bo-Lun Huang, Teng-Fang Hsiao, Hong-Han Shuai

TL;DR
This paper introduces a score replacement method with bounded deviation for improved rare prompt generation in diffusion models, enabling adaptive switching between proxy and true scores to better synthesize rare concepts.
Contribution
It proposes a novel score replacement framework with a bounded deviation criterion, allowing adaptive control over prompt switching in diffusion models for rare concept generation.
Findings
Consistently improves rare concept synthesis across multiple models.
Outperforms strong baselines in automated metrics.
Enhances human evaluation scores.
Abstract
Diffusion models achieve impressive performance in high-fidelity image generation but often struggle with rare concepts that appear infrequently in the training distribution. Prior work attempts to address this issue by prompt switching, where generation begins with a frequent proxy prompt and later transitions to the original rare prompt. However, such designs typically rely on fixed schedules that disregard the model's internal dynamics, making them brittle across prompts and backbones. In this paper, we re-frame rare prompt generation through the lens of score replacement: the denoising trajectory of a rare prompt can be initially guided by the score of a semantically related frequent prompt, which acts as a proxy. However, as the process unfolds, the proxy score gradually diverges from the true rare prompt score. To control this drift, we introduce a bounded deviation criterion that…
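The switching idea described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `sample_with_score_replacement`, the Euler-style update, the step-size-weighted L2 norm used as the deviation measure, and the scalar `budget` are all hypothetical choices made for illustration. The loop follows the proxy score while the accumulated difference between proxy and true scores stays within the budget, then switches to the true score for the remaining steps.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
ScoreFn = Callable[[Sequence[float], int], Vector]


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def sample_with_score_replacement(
    x0: Sequence[float],
    proxy_score: ScoreFn,   # hypothetical: score under the frequent proxy prompt
    true_score: ScoreFn,    # hypothetical: score under the rare target prompt
    num_steps: int,
    step_size: float,
    budget: float,          # assumed scalar bound on accumulated deviation
) -> Tuple[Vector, Optional[int]]:
    """Follow the proxy score until the accumulated score gap exceeds `budget`."""
    x = list(x0)
    accumulated = 0.0
    switch_step: Optional[int] = None
    for t in range(num_steps):
        s_true = true_score(x, t)
        if switch_step is None:
            s_proxy = proxy_score(x, t)
            gap = [p - q for p, q in zip(s_proxy, s_true)]
            # Assumed deviation measure: step-size-weighted L2 norm of the gap.
            accumulated += step_size * l2_norm(gap)
            if accumulated > budget:
                switch_step = t      # budget exhausted: switch before updating
                score = s_true
            else:
                score = s_proxy
        else:
            score = s_true
        # Toy Euler update along the chosen score.
        x = [xi + step_size * si for xi, si in zip(x, score)]
    return x, switch_step


# Toy scores: the proxy pulls samples toward 1.0, the target toward 2.0.
proxy = lambda x, t: [-(xi - 1.0) for xi in x]
target = lambda x, t: [-(xi - 2.0) for xi in x]

final_x, switched_at = sample_with_score_replacement(
    [0.0], proxy, target, num_steps=50, step_size=0.1, budget=0.35
)
```

Note that this sketch evaluates both scores at every step before the switch, which is the kind of extra cost the reviews ask the authors to quantify.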
Peer Reviews
Decision · Submitted to ICLR 2026
- **Adaptive scheduling.** The paper replaces R2F’s fixed, heuristic switching rule with adaptive scheduling based on the score of the rare prompt.
- **Comprehensive analyses.** It provides diverse supporting analyses, including score-trajectory visualization, cross-model comparisons, and derivations that validate the method’s design.
- **Theoretical grounding via score approximation.** The method offers a theoretical perspective that connects the adaptive switching behavior to the score appro
- **Limited methodological novelty.** While the paper’s attempt to make R2F’s fixed strategy adaptive is meaningful, the method itself is not substantially novel and largely builds on existing switching ideas.
- **Optimality of switching approach.** It remains unclear whether prompt switching is an optimal formulation for rare-concept generation. Since this strategy inherently depends on finding a proxy prompt, it may be less efficient or generalizable than other rare-concept generation methods
1. It reframes prompt switching as a score-aware control problem, introducing an adaptive score-replacement trigger instead of brittle, fixed schedules.
2. A clear theoretical bound ties final-sample deviation to accumulated score differences, giving principled guidance for when to switch prompts.
3. The method is practical and backbone-agnostic, requiring no architectural changes or extra training, and works across SDXL/SD3/Flux/Sana. Empirically, it consistently outperforms R2F on rare-prompt
1. Score-difference tracking and similarity-based budgeting introduce computational overhead. However, the paper does not quantify this cost, e.g., wall-clock time, GPU-hours, and memory. Therefore, comparing these metrics against prompt-switching baselines is necessary.
2. In light of the possible brittleness of early-regime budgeting across models and prompts, please report robustness evidence or adaptation strategies spanning backbone changes, prompt diversity, and noise-schedule shifts.
3.
- The paper is well written and systematically analyzes the limitations of existing rare-prompt switching pipelines. The proposed controller addresses those issues cleanly and yields consistent improvements across models.
- The method's effectiveness may lean on several choices and hyperparameters that feel more engineered than principled. The bucket threshold is estimated from an empirically detected stable region, which can depend on the backbone generative model. Category- and model-specific thresholds further improve results but also signal configuration fragility: gains partly come from per-benchmark tailoring rather than a single robust rule.
- Minor: While the paper substantially strengthens R2F with a pr
Taxonomy
Topics: Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications · Neural Networks and Applications
