SHIFT: Steering Hidden Intermediates in Flow Transformers

Nina Konovalova; Andrey Kuznetsov; Aibek Alanov

arXiv:2604.09213·cs.CV·April 13, 2026

TL;DR

SHIFT is a lightweight framework for concept removal and style shifting in DiT-based diffusion models. It steers intermediate activations at inference time and does not retrain the base model.

Contribution

It introduces an activation steering method for DiT diffusion models, inspired by activation steering in large language models. Learned steering vectors are applied to selected layers and timesteps at inference time to suppress concepts or shift style.

Findings

01. SHIFT effectively suppresses unwanted concepts in generated images.

02. It enables style transfer and object modification without retraining.

03. The method maintains high image quality while providing flexible control.

Abstract

Diffusion models have become leading approaches for high-fidelity image generation. Recent DiT-based diffusion models, in particular, achieve strong prompt adherence while producing high-quality samples. We propose SHIFT, a simple but effective and lightweight framework for concept removal in DiT diffusion models via targeted manipulation of intermediate activations at inference time, inspired by activation steering in large language models. SHIFT learns steering vectors that are dynamically applied to selected layers and timesteps to suppress unwanted visual concepts while preserving the prompt's remaining content and overall image quality. Beyond suppression, the same mechanism can shift generations into a desired style domain or bias samples toward adding or changing target objects. We demonstrate that SHIFT provides effective and flexible control over DiT generation across…
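
The general idea of steering a hidden activation only at selected layers and timesteps can be illustrated with a small sketch. This is not the SHIFT implementation: the abstract says SHIFT learns its steering vectors, whereas the sketch below substitutes a difference-of-means vector, a common baseline from the LLM activation-steering literature. The names `difference_of_means`, `steer`, `layers`, `timesteps`, and `alpha` are hypothetical, and plain Python lists stand in for the token activations of a real DiT block.

```python
from typing import List, Sequence, Set

Vector = List[float]


def mean_vector(rows: Sequence[Vector]) -> Vector:
    """Element-wise mean of a non-empty batch of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def difference_of_means(with_concept: Sequence[Vector],
                        without_concept: Sequence[Vector]) -> Vector:
    """Assumed stand-in for a learned steering vector: the mean activation
    on prompts containing the concept minus the mean on prompts without it."""
    mu_pos = mean_vector(with_concept)
    mu_neg = mean_vector(without_concept)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def steer(hidden: Vector, vector: Vector, layer: int, timestep: int,
          layers: Set[int], timesteps: Set[int], alpha: float = 1.0) -> Vector:
    """Subtract alpha * vector from the activation, but only when both the
    current layer and the current timestep are in the selected sets.
    A negative alpha would push toward the concept instead of away from it."""
    if layer not in layers or timestep not in timesteps:
        return list(hidden)  # unselected layer or timestep: leave unchanged
    return [h - alpha * v for h, v in zip(hidden, vector)]


# Toy activations: the concept shows up along the first coordinate.
v = difference_of_means([[2.0, 0.0], [4.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
steered = steer([5.0, 1.0], v, layer=3, timestep=10,
                layers={3}, timesteps={10}, alpha=0.5)
untouched = steer([5.0, 1.0], v, layer=2, timestep=10,
                  layers={3}, timesteps={10}, alpha=0.5)
```

In a real model the same gating would typically live in a forward hook on the chosen transformer blocks, with the current sampling timestep passed in from the sampler loop. Which layers, timesteps, and strengths work well is an empirical question that this sketch does not address.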
