AttnMod: Attention-Based New Art Styles

Shih-Chieh Su

arXiv:2409.10028 · cs.CV · August 4, 2025

Open Access

TL;DR

AttnMod is a training-free technique that modifies cross-attention in pre-trained diffusion models to create diverse, unpromptable art styles, enhancing the expressive capacity of text-to-image generation.

Contribution

The paper introduces an attention modulation method that enables stylistic transformations without retraining the model or changing the prompt.

Findings

01 Enables diverse stylistic transformations

02 Does not require retraining or prompt changes

03 Expands expressive capacity of diffusion models

Abstract

We introduce AttnMod, a training-free technique that modulates cross-attention in pre-trained diffusion models to generate novel, unpromptable art styles. The method is inspired by how a human artist might reinterpret a generated image, for example by emphasizing certain features, dispersing color, twisting silhouettes, or materializing unseen elements. AttnMod simulates this intent by altering how the text prompt conditions the image through attention during denoising. These targeted modulations enable diverse stylistic transformations without changing the prompt or retraining the model, and they expand the expressive capacity of text-to-image generation.
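
The abstract does not state the exact modulation rule, so the following is only a minimal sketch of what "modulating cross-attention" could look like. It assumes a single attention head and a hypothetical scalar `scale` that multiplies the pre-softmax scores between image-token queries and text-token keys. The function names and the `scale` parameter are illustrative assumptions, not the paper's API.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys, scale=1.0):
    """Attention weights of each image-token query over the text-token keys.

    `scale` is a hypothetical modulation factor (an assumption for this
    sketch): 1.0 gives standard scaled dot-product attention, 0.0 makes
    every text token equally weighted, and values above 1.0 sharpen the map.
    """
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [
            scale * sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        weights.append(softmax(scores))
    return weights


def cross_attention(queries, keys, values, scale=1.0):
    """Weighted sum of text-token values for each image-token query."""
    out = []
    for w in attention_weights(queries, keys, scale):
        dim = len(values[0])
        out.append([sum(wi * v[j] for wi, v in zip(w, values)) for j in range(dim)])
    return out


# Toy example: one image token attending over two text tokens.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]

for s in (0.0, 1.0, 4.0):
    print(s, cross_attention(queries, keys, values, scale=s))
```

In a real diffusion pipeline, a factor like this would be applied inside the denoiser's cross-attention layers and could vary across denoising steps. Both of those details are left out here because the abstract does not specify them.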

Taxonomy

Topics: Aesthetic Perception and Analysis · Digital Media and Visual Art · Art Education and Development

Methods: Softmax · Attention Is All You Need · Diffusion