OmniGen2: Towards Instruction-Aligned Multimodal Generation
Chenyuan Wu, Pengfei Zheng, Ruiran Yan, Shitao Xiao, Xin Luo, Yueze Wang, Wanli Li, Xiyan Jiang, Yexin Liu, Junjie Zhou, Ze Liu, Ziyi Xia, Chaofan Li, Haoge Deng, Jiahao Wang, Kun Luo, Bo Zhang, Defu Lian, Xinlong Wang, Zhongyuan Wang, Tiejun Huang, Zheng Liu

TL;DR
OmniGen2 is an open-source multimodal generative model that unifies text-to-image generation, image editing, and in-context generation. It uses separate decoding pathways for text and images plus a reflection mechanism, and reports state-of-the-art consistency among open-source models.
Contribution
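OmniGen2 pairs separate text and image decoding pathways (unshared parameters, decoupled image tokenizer) with a reflection mechanism for image generation, so a single model covers diverse generation tasks while keeping the text generation ability of the underlying multimodal understanding model.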
Findings
- Achieves competitive results on multiple benchmarks.
- Sets the state of the art among open-source models for consistency.
- Supports diverse tasks with a unified architecture.
Abstract
In this work, we introduce OmniGen2, a versatile and open-source generative model designed to provide a unified solution for diverse generation tasks, including text-to-image, image editing, and in-context generation. Unlike OmniGen v1, OmniGen2 features two distinct decoding pathways for text and image modalities, utilizing unshared parameters and a decoupled image tokenizer. This design enables OmniGen2 to build upon existing multimodal understanding models without the need to re-adapt VAE inputs, thereby preserving the original text generation capabilities. To facilitate the training of OmniGen2, we developed comprehensive data construction pipelines, encompassing image editing and in-context generation data. Additionally, we introduce a reflection mechanism tailored for image generation tasks and curate a dedicated reflection dataset based on OmniGen2. Despite its relatively modest…
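The abstract names a reflection mechanism but, in the portion shown here, does not describe how it operates. As a rough illustration only, the sketch below shows a generic generate, critique, regenerate loop, which is one common way such mechanisms are structured. Every name in it (`generate_with_reflection`, `toy_generate`, `toy_critique`, the string stand-in for an image) is hypothetical and is not taken from the OmniGen2 codebase.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins for illustration; NOT OmniGen2's actual API.
Image = str  # a real system would use an image tensor or PIL image
Generator = Callable[[str, List[str]], Image]
Critic = Callable[[str, Image], Optional[str]]  # returns None when satisfied


def generate_with_reflection(
    prompt: str,
    generate: Generator,
    critique: Critic,
    max_rounds: int = 3,
) -> Tuple[Image, List[str]]:
    """Generate, then revise while the critic still reports a problem."""
    feedback: List[str] = []
    image = generate(prompt, feedback)
    for _ in range(max_rounds):
        issue = critique(prompt, image)
        if issue is None:  # critic is satisfied, stop early
            break
        feedback.append(issue)
        image = generate(prompt, feedback)  # retry with accumulated notes
    return image, feedback


def toy_generate(prompt: str, feedback: List[str]) -> Image:
    # Toy behavior: the first attempt drops the last word of the prompt;
    # any feedback at all makes the generator reproduce the full prompt.
    words = prompt.split()
    return " ".join(words if feedback else words[:-1])


def toy_critique(prompt: str, image: Image) -> Optional[str]:
    # Toy critic: report prompt words that are absent from the "image".
    present = image.split()
    missing = [w for w in prompt.split() if w not in present]
    return f"missing: {', '.join(missing)}" if missing else None


if __name__ == "__main__":
    result, notes = generate_with_reflection(
        "a red cube on a table", toy_generate, toy_critique
    )
    print(result)
    print(notes)
```

In this toy setup the first attempt omits "table", the critic flags it, and the second attempt satisfies the critic, so the loop ends after one revision. In a real system the generator and critic would be model calls, and the paper's actual reflection procedure may differ from this pattern.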
Code & Models
- 🤗 OmniGen2/OmniGen2 (model): 3.5k downloads, 439 likes
- 🤗 jobs-git/OmniGen2 (model): 4 downloads
- 🤗 BAAI/OmniGen2 (model): 25 downloads, 7 likes
- 🤗 OmniGen2/OmniGen2-EditScore7B (model): 65 downloads, 7 likes
- 🤗 OmniGen2/OmniGen2-EditScore7B-v1.1 (model): 18 downloads, 6 likes
- 🤗 OmniGen2/OmniGen2-RL (model): 16 downloads, 5 likes
- 🤗 Azily/Macro-OmniGen2 (model)
