OmniPrism: Learning Disentangled Visual Concept for Image Generation

Yangyang Li; Daqing Liu; Wu Liu; Allen He; Xinchen Liu; Yongdong Zhang; Guoqing Jin

arXiv:2412.12242 · cs.CV · April 13, 2026

TL;DR

OmniPrism disentangles the visual concepts in a reference image under natural-language guidance, so that a diffusion model can generate creative images from the selected concepts, without confusion from irrelevant ones, while staying faithful to the prompt.

Contribution

The paper contributes a contrastive orthogonal disentangled training pipeline and a paired concept disentangled dataset (PCD-200K) for learning disentangled concept representations that a diffusion model can incorporate.
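
This page does not spell out the losses behind the contrastive orthogonal disentangled pipeline, so the following is only a generic sketch of what a contrastive term and an orthogonality term over concept embeddings could look like. It is not the paper's formulation: the function names, the InfoNCE-style form, the squared-cosine penalty, and the `temperature` default are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def contrastive_term(anchor, positive, negatives, temperature=0.1):
    """Hypothetical InfoNCE-style loss: small when the anchor is close to
    its positive and far from every negative."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def orthogonal_term(concepts):
    """Hypothetical orthogonality penalty: mean squared cosine similarity
    over all distinct pairs of concept embeddings (0 when all pairs are
    orthogonal). Expects at least two embeddings."""
    pairs = [(i, j)
             for i in range(len(concepts))
             for j in range(i + 1, len(concepts))]
    return sum(cosine(concepts[i], concepts[j]) ** 2 for i, j in pairs) / len(pairs)
```

In this toy setup, the embeddings of one concept taken from two paired images would serve as anchor and positive, and embeddings of other concepts as negatives. How the paper actually combines or weights such terms is not stated here.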

Findings

01. Achieves high-quality, concept-disentangled image generation.
02. Effectively incorporates multiple concepts guided by natural language.
03. Demonstrates superior performance over existing methods.

Abstract

Creative visual concept generation often draws inspiration from specific concepts in a reference image to produce relevant outcomes. However, existing methods are typically constrained to single-aspect concept generation or are easily disrupted by irrelevant concepts in multi-aspect concept scenarios, leading to concept confusion and hindering creative generation. To address this, we propose OmniPrism, a visual concept disentangling approach for creative image generation. Our method learns disentangled concept representations guided by natural language and trains a diffusion model to incorporate these concepts. We utilize the rich semantic space of a multimodal extractor to achieve concept disentanglement from given images and concept guidance. To disentangle concepts with different semantics, we construct a paired concept disentangled dataset (PCD-200K), where each pair shares the same…
