OmniPSD: Layered PSD Generation with Diffusion Transformer

Cheng Liu; Yiren Song; Haofan Wang; Mike Zheng Shou

arXiv:2512.09247 · cs.CV · December 11, 2025

TL;DR

OmniPSD introduces a diffusion transformer-based framework capable of generating and decomposing layered PSD files with transparency, advancing image editing and design automation.

Contribution

The paper presents a unified diffusion model for both layered PSD generation and decomposition, using spatial attention to relate layers and an RGBA-VAE to preserve transparency.
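To make the transparency requirement concrete, here is a minimal sketch of how a stack of RGBA layers flattens into one image using the standard Porter-Duff "over" operator on straight (non-premultiplied) alpha. It is not taken from the paper: the names `over` and `flatten` and the nested-list image format are illustrative assumptions. It only shows why a layered output needs a correct alpha channel per layer.

```python
from typing import List, Tuple

# Assumption: pixels are straight-alpha RGBA tuples with channels in [0, 1].
RGBA = Tuple[float, float, float, float]
Layer = List[List[RGBA]]  # rows of pixels


def over(top: RGBA, bottom: RGBA) -> RGBA:
    """Composite one pixel over another (Porter-Duff 'over')."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels fully transparent: colour is undefined, return clear.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t: float, b: float) -> float:
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers: List[Layer]) -> Layer:
    """Flatten same-sized layers given in bottom-to-top order."""
    result = [row[:] for row in layers[0]]
    for layer in layers[1:]:
        for y, row in enumerate(layer):
            for x, pixel in enumerate(row):
                result[y][x] = over(pixel, result[y][x])
    return result
```

Decomposition is the inverse problem: given only the flattened result, recover layers whose colours and alpha values composite back to it.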

Findings

01 High-fidelity layered PSD generation

02 Structural consistency in output layers

03 Effective transparency preservation

Abstract

Recent advances in diffusion models have greatly improved image generation and editing, yet generating or reconstructing layered PSD files with transparent alpha channels remains highly challenging. We propose OmniPSD, a unified diffusion framework built upon the Flux ecosystem that enables both text-to-PSD generation and image-to-PSD decomposition through in-context learning. For text-to-PSD generation, OmniPSD arranges multiple target layers spatially into a single canvas and learns their compositional relationships through spatial attention, producing semantically coherent and hierarchically structured layers. For image-to-PSD decomposition, it performs iterative in-context editing, progressively extracting and erasing textual and foreground components to reconstruct editable PSD layers from a single flattened image. An RGBA-VAE is employed as an auxiliary representation module to…
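The abstract says the target layers are arranged spatially into a single canvas, but the excerpt does not give the layout. The sketch below assumes a simple row-major grid of equally sized layers; the names `tile_layers` and `split_canvas`, the `cols` parameter, and the toy single-channel image format are all hypothetical. It only illustrates packing layers into one canvas and slicing them back out, not the model itself.

```python
from typing import List

# Assumption: a toy single-channel image stored as rows of pixel values.
Image = List[List[int]]


def tile_layers(layers: List[Image], cols: int) -> Image:
    """Pack equally sized layers into one canvas, row-major, `cols` per row."""
    h, w = len(layers[0]), len(layers[0][0])
    rows = -(-len(layers) // cols)  # ceiling division
    canvas = [[0] * (w * cols) for _ in range(h * rows)]
    for i, layer in enumerate(layers):
        r, c = divmod(i, cols)
        for y in range(h):
            canvas[r * h + y][c * w:(c + 1) * w] = layer[y]
    return canvas


def split_canvas(canvas: Image, h: int, w: int, count: int) -> List[Image]:
    """Recover the first `count` tiles of size h x w from the canvas."""
    cols = len(canvas[0]) // w
    layers = []
    for i in range(count):
        r, c = divmod(i, cols)
        layers.append([canvas[r * h + y][c * w:(c + 1) * w] for y in range(h)])
    return layers
```

With every layer on one canvas, a single attention pass can relate any region of one layer to any region of another, which is one plausible reading of the "spatial attention" described above.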

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Digital Humanities and Scholarship