OmniAlpha: Aligning Transparency-Aware Generation via Multi-Task Unified Reinforcement Learning

Hao Yu; Jinglin Wang; Jiabo Zhan; Rui Chen; Zile Wang; Huaisong Zhang; Hongyu Li; Xinrui Chen; Yongxian Wei; Chun Yuan

arXiv:2511.20211 · cs.CV · April 29, 2026

TL;DR

OmniAlpha introduces a unified reinforcement learning framework that models transparency-aware image generation by jointly optimizing RGB and alpha channels across multiple tasks, improving quality and coherence.

Contribution

The paper presents a multi-task RL approach that combines an alpha-aware VAE with a Diffusion Transformer for unified, high-quality RGBA generation and manipulation.

Findings

01. Achieves a 9.07% reduction in RGB L1 error on layer decomposition.

02. Outperforms specialized models on matting, with improvements of 74% on SAD and 68% on Grad.

03. Consistently outperforms baseline models across five transparency-aware tasks.
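
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how SAD (sum of absolute differences between alpha mattes) and a mean L1 error are commonly computed, and how a relative reduction such as "9.07%" is derived from two error values. The function names and toy values are hypothetical and are not taken from the paper or its repository; the Grad (gradient error) metric is not covered here, and the paper's exact evaluation protocol may differ.

```python
def sad(pred_alpha, true_alpha):
    """Sum of absolute differences between two alpha mattes (2D lists in [0, 1])."""
    return sum(
        abs(p - t)
        for pred_row, true_row in zip(pred_alpha, true_alpha)
        for p, t in zip(pred_row, true_row)
    )


def mean_l1(pred, target):
    """Mean absolute error between two equally sized flat lists of values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline value."""
    return (baseline - improved) / baseline


# Toy 2x2 mattes, made up for illustration only (not data from the paper).
pred = [[0.0, 1.0], [0.5, 0.5]]
truth = [[0.0, 0.5], [1.0, 0.5]]
print(sad(pred, truth))                  # 1.0
print(mean_l1([0.0, 1.0], [0.5, 0.5]))   # 0.5
print(relative_reduction(2.0, 1.5))      # 0.25
```

Under this convention, lower SAD and L1 values are better, so an "improvement" or "reduction" is a decrease relative to the baseline's error.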

Abstract

Transparency-aware generation requires modeling not only RGB appearance but also alpha-based opacity and cross-layer composition, which are essential for tasks such as image matting, object removal, layer decomposition, and multi-layer content creation. However, existing RGBA-related methods remain largely fragmented, with separate pipelines designed for individual tasks. While a unified model is desirable, supervised fine-tuning alone is insufficient, as localized regression objectives cannot directly optimize the compositional fidelity, alpha-boundary precision, and structural consistency required for high-quality RGBA generation. To address this, we propose OmniAlpha, a unified multi-task reinforcement learning framework for transparency-aware generation and manipulation. OmniAlpha combines an end-to-end alpha-aware VAE and a sequence-to-sequence Diffusion Transformer, with a…
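
The alpha-based opacity and cross-layer composition mentioned in the abstract are conventionally expressed with the Porter-Duff "over" operator. The sketch below shows that standard operator on single straight-alpha (non-premultiplied) RGBA pixels with channels in [0, 1]. It is an illustrative assumption about what "composition" means here, not the paper's implementation; the names `over` and `composite_layers` are hypothetical.

```python
def over(fg, bg):
    """Composite one straight-alpha RGBA pixel over another (Porter-Duff 'over')."""
    f_r, f_g, f_b, f_a = fg
    b_r, b_g, b_b, b_a = bg
    # Resulting opacity: foreground plus whatever background shows through.
    out_a = f_a + b_a * (1.0 - f_a)
    if out_a == 0.0:
        # Both pixels fully transparent: color is undefined, return clear black.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(f_c, b_c):
        # Alpha-weighted color mix, un-premultiplied by the output alpha.
        return (f_c * f_a + b_c * b_a * (1.0 - f_a)) / out_a

    return (blend(f_r, b_r), blend(f_g, b_g), blend(f_b, b_b), out_a)


def composite_layers(layers):
    """Composite a list of RGBA pixels ordered bottom to top."""
    result = layers[0]
    for layer in layers[1:]:
        result = over(layer, result)
    return result


# Half-transparent red over opaque blue (toy values for illustration).
red_half = (1.0, 0.0, 0.0, 0.5)
blue_opaque = (0.0, 0.0, 1.0, 1.0)
print(composite_layers([blue_opaque, red_half]))  # (0.5, 0.0, 0.5, 1.0)
```

Layer decomposition can be read as the inverse problem: recovering plausible RGBA layers whose composite reproduces the input image.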

Code & Models

Repositories

GitHub: longin-yu/OmniAlpha

Models

Hugging Face: Longin-Yu/OmniAlpha