DeRaDiff: Denoising Time Realignment of Diffusion Models

Ratnavibusena Don Shahain Manujith; Teoh Tze Tzun; Kenji Kawaguchi; Yang Zhang

arXiv:2601.20198·cs.LG·February 23, 2026

Open Access · 3 Reviews

TL;DR

DeRaDiff introduces a denoising-time realignment method for diffusion models that adjusts the regularization strength during sampling, approximating models aligned at other regularization strengths without additional training.

Contribution

It proposes a new realignment technique that modulates regularization strength on-the-fly, reducing computational costs and eliminating the need for multiple model alignments.

Findings

01

Consistently approximates models aligned at different regularization strengths.

02

Reduces computational costs by avoiding multiple training runs.

03

Improves text-image alignment and image quality metrics.

Abstract

Recent advances align diffusion models with human preferences to increase aesthetic appeal and mitigate artifacts and biases. Such methods aim to maximize a conditional output distribution aligned with higher rewards whilst not drifting far from a pretrained prior. This is commonly enforced by KL (Kullback-Leibler) regularization. As such, a central issue still remains: how does one choose the right regularization strength? Too high a strength leads to limited alignment, and too low a strength leads to "reward hacking". This renders the task of choosing the correct regularization strength highly non-trivial. Existing approaches sweep over this hyperparameter by aligning a pretrained model at multiple regularization strengths and then choosing the best strength. Unfortunately, this is prohibitively expensive. We introduce DeRaDiff, a denoising time realignment procedure that, after…
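
To illustrate why realignment without retraining is plausible: for a KL-regularized reward objective, the optimum has the form p_β ∝ p_ref · exp(r/β), so the optimum at strength β/λ is proportional to p_ref^(1−λ) · p_β^λ. The sketch below is an illustration of that identity and not the paper's algorithm. It assumes that one denoising step of each model is a diagonal Gaussian and computes the renormalized geometric mixture in closed form. The function name `geometric_mix_gaussians` and the parameter `lam` are hypothetical.

```python
from typing import List, Sequence, Tuple


def geometric_mix_gaussians(
    mu_ref: Sequence[float],
    var_ref: Sequence[float],
    mu_aligned: Sequence[float],
    var_aligned: Sequence[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Mean and variance of N_ref^(1 - lam) * N_aligned^lam after renormalizing.

    All inputs are per-dimension (diagonal Gaussian assumption).
    lam = 0 returns the reference step, lam = 1 the aligned step.
    """
    mean: List[float] = []
    var: List[float] = []
    for m_r, v_r, m_a, v_a in zip(mu_ref, var_ref, mu_aligned, var_aligned):
        # Precisions (inverse variances) combine linearly under a geometric mixture.
        precision = (1.0 - lam) / v_r + lam / v_a
        if precision <= 0.0:
            # The product is not a normalizable Gaussian for this lam.
            raise ValueError("lam gives a non-positive precision")
        var.append(1.0 / precision)
        # The mean is the precision-weighted combination of the two means.
        mean.append(((1.0 - lam) * m_r / v_r + lam * m_a / v_a) / precision)
    return mean, var


if __name__ == "__main__":
    # Equal variances: the mean moves linearly between (or beyond) the two models.
    print(geometric_mix_gaussians([0.0], [1.0], [2.0], [1.0], 0.5))
    print(geometric_mix_gaussians([0.0], [1.0], [2.0], [1.0], 2.0))
```

Under these assumptions, equal variances reduce the mixture to a linear interpolation of the two means. Values of `lam` between 0 and 1 pull the step back toward the reference model (stronger regularization), while `lam` above 1 extrapolates past the aligned model (weaker regularization).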

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

- The paper shows that DeRaDiff has a closed-form solution.
- The training-free inference-time alignment method saves computational costs.
- The method is able to undo reward hacking, eliminating the need for realignment.

Weaknesses

- When evaluating human alignment, a real-world human study would have been nice.
- Since the pre-trained reference model is mixed with the aligned model, this could lead to biases reappearing in the output of DeRaDiff.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

Overall, the paper makes a worthwhile contribution with a clear presentation and credible theoretical support.

## Presentation: ~95th percentile
This paper presents coherence, and most of the idea is clearly addressed. I would like to thank you for saving me a lot of time reviewing your work.

## Soundness: ~75th percentile
Theorem 1 underpins the soundness of the paper. Although I have not examined every minute detail, the derivation appears to be correct.

## Contribution: 40th~70th percen

Weaknesses

## Soundness
I would have preferred to see additional comparisons between your method and other approaches applied to similar problems. Nonetheless, the absence of such comparisons does not undermine the validity of your claim.

## Presentation
1. Presenting the denominator in Equations (3)–(6) as a partition function has both advantages and disadvantages. While it clarifies the interpretation, the repeated form of the equations feels redundant. If the repetition is intentional, please just

Reviewer 03 · Rating 6 · Confidence 3

Strengths

- The paper is reasonably well-written with a coherent narrative. - The idea of extending DeRa (from LLMs) to diffusion models is an interesting angle. - The results are rather promising, and align with the core claims.

Weaknesses

- I believe Fig. 7 alone is too little for a comparison across base, aligned and realigned models, and it is not discussed in the necessary level of detail. In my eyes this should be established with more qualitative results and further elaboration across the images and models.
- The paper would benefit from a thorough proof-read. A few typos and styling inconsistencies can be seen across the document.
- Maybe (pareto-front) reward vs divergence plots can help establish the core message from a different a

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Aesthetic Perception and Analysis · Domain Adaptation and Few-Shot Learning