TP-Blend: Textual-Prompt Attention Pairing for Precise Object-Style Blending in Diffusion Models

Xin Jin; Yichuan Zhong; Yapeng Tian

arXiv:2601.08011·cs.CV·March 3, 2026

TP-Blend: Textual-Prompt Attention Pairing for Precise Object-Style Blending in Diffusion Models

Xin Jin, Yichuan Zhong, Yapeng Tian

PDF

Open Access

TL;DR

TP-Blend is a novel, training-free framework that enables precise object and style blending in diffusion models by using dual attention mechanisms, resulting in high-quality, controllable image edits.

Contribution

It introduces a dual-attention approach with CAOF and SASF for simultaneous object and style blending without additional training.

Findings

01

Produces high-resolution, photo-realistic edits

02

Surpasses recent baselines in fidelity and quality

03

Operates with fast inference speed

Abstract

Current text-conditioned diffusion editors handle single object replacement well but struggle when a new object and a new style must be introduced simultaneously. We present Twin-Prompt Attention Blend (TP-Blend), a lightweight training-free framework that receives two separate textual prompts, one specifying a blend object and the other defining a target style, and injects both into a single denoising trajectory. TP-Blend is driven by two complementary attention processors. Cross-Attention Object Fusion (CAOF) first averages head-wise attention to locate spatial tokens that respond strongly to either prompt, then solves an entropy-regularised optimal transport problem that reassigns complete multi-head feature vectors to those positions. CAOF updates feature vectors at the full combined dimensionality of all heads (e.g., 640 dimensions in SD-XL), preserving rich cross-head correlations…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsGenerative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Neural Network Applications