MAST: Mask-Guided Attention Mass Allocation for Training-Free Multi-Style Transfer

Dongkyung Kang; Jaeyeon Hwang; Junseo Park; Minji Kang; Yeryeong Lee; Beomseok Ko; Hanyoung Roh; Jeongmin Shin; Hyeryung Jang

arXiv:2604.12281 · cs.CV · April 15, 2026

TL;DR

MAST is a training-free multi-style transfer framework that uses mask-guided attention to control content-style interactions inside a diffusion model, aiming for artifact-free, structure-preserving stylization when several styles are applied to one image.

Contribution

MAST introduces a mask-guided attention mechanism built from four modules that extends diffusion-based style transfer to multiple styles without any additional training.

Findings

01. Effectively mitigates boundary artifacts in multi-style transfer.

02. Maintains structural consistency and texture fidelity.

03. Performs well even with multiple styles applied.

Abstract

Style transfer aims to render a content image with the visual characteristics of a reference style while preserving its underlying semantic layout and structural geometry. While recent diffusion-based models demonstrate strong stylization capabilities by leveraging powerful generative priors and controllable internal representations, they typically assume a single global style. Extending them to multi-style scenarios often leads to boundary artifacts, unstable stylization, and structural inconsistency due to interference between multiple style representations. To overcome these limitations, we propose MAST (Mask-Guided Attention Mass Allocation for Training-Free Multi-Style Transfer), a novel training-free framework that explicitly controls content-style interactions within the diffusion attention mechanism. To achieve artifact-free and structure-preserving stylization, MAST integrates…
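
The abstract is cut off before it describes MAST's modules, so the sketch below is not the paper's algorithm. It only illustrates, under stated assumptions, what allocating attention mass with a mask could look like for a single content query: each style gets its own softmax over its keys, and a per-style mask weight fixes the share of total attention that style receives. The function name `mask_weighted_attention`, the per-style normalization, and the toy vectors are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mask_weighted_attention(query, style_keys, style_values, mask_weights):
    """Attend from one content query to several styles.

    ASSUMPTION (not taken from the paper): every style is normalized with
    its own softmax, and mask_weights[s] sets the total attention mass
    that style s receives for this query. The weights should sum to 1.
    """
    scale = 1.0 / math.sqrt(len(query))
    dim = len(style_values[0][0])
    output = [0.0] * dim
    mass_per_style = []
    for keys, values, weight in zip(style_keys, style_values, mask_weights):
        # Scaled dot-product scores between the query and this style's keys.
        logits = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
        # Rescale the per-style attention so it sums to the mask weight.
        attn = [weight * a for a in softmax(logits)]
        mass_per_style.append(sum(attn))
        # Accumulate the weighted values into the output vector.
        for a, value in zip(attn, values):
            for i, v in enumerate(value):
                output[i] += a * v
    return output, mass_per_style


# Toy data: style A has two tokens, style B has one (hypothetical values).
query = [1.0, 0.0]
style_keys = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
style_values = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0]]]

# A hard mask sends all attention mass to style A for this query.
hard_output, hard_mass = mask_weighted_attention(
    query, style_keys, style_values, [1.0, 0.0]
)

# A soft mask near a region boundary splits the mass between both styles.
soft_output, soft_mass = mask_weighted_attention(
    query, style_keys, style_values, [0.25, 0.75]
)
print(hard_mass, soft_mass)
```

With a hard mask, a query inside one region never mixes in features from another style, which is one plausible way to avoid the interference the abstract describes. A soft mask blends styles in proportion to its weights, which is one way a transition near a region boundary could be smoothed.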
