oboro: Text-to-Image Synthesis on Limited Data using Flow-based Diffusion Transformer with MMH Attention
Ryusuke Mizutani, Kazuaki Matano, Tsugumi Kadowaki, Haruki Tenya, Layris, nuigurumi, Koki Hashimoto, Yu Tanaka

TL;DR
This paper introduces 'oboro', a Japanese-developed image generation model that uses a flow-based diffusion transformer with MMH attention to produce high-quality images from limited training data. The model is openly released and available for commercial use.
Contribution
The paper presents 'oboro', described as the first open-source Japanese image generation model trained from scratch on limited data. It combines a unique architecture with publicly released weights.
Findings
High-quality images generated from limited datasets
Open-source model available for commercial use
Presented as the first open-source, Japanese-developed image generation model trained from scratch
Abstract
This project was conducted as a second-term adopted project of the "Post-5G Information and Communication System Infrastructure Enhancement R&D Project / Development of Competitive Generative AI Foundation Models (GENIAC)," a program of the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO). To address challenges such as labor shortages in Japan's anime production industry, the project set out to develop an image generation model from scratch. This report details the technical specifications of the resulting model, "oboro:", which was built from scratch and trained using only copyright-cleared images. A key characteristic is its architecture, designed to generate high-quality images even from limited datasets. The foundation model weights and inference…
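The abstract does not spell out what "flow-based" means for this model. As a rough illustration only, the sketch below assumes it refers to a rectified-flow (flow matching) formulation, in which a network learns the velocity that carries noise to data along a straight path. That assumption is not confirmed by the text above. All function names are hypothetical, the "oracle" predictor stands in for a trained, text-conditioned transformer, and MMH attention is not shown.

```python
import random

def interpolate(noise, data, t):
    """Point on the straight path from noise (t=0) to data (t=1)."""
    return [(1.0 - t) * n + t * d for n, d in zip(noise, data)]

def target_velocity(noise, data):
    """Regression target for the velocity model; constant along a straight path."""
    return [d - n for n, d in zip(noise, data)]

def flow_matching_loss(predict_velocity, noise, data, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(noise, data, t)
    pred = predict_velocity(x_t, t)
    target = target_velocity(noise, data)
    return sum((p - v) ** 2 for p, v in zip(pred, target)) / len(target)

def euler_sample(predict_velocity, noise, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    x = list(noise)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = predict_velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy check with a hypothetical oracle that already knows the target velocity.
# In a real system this would be a trained network conditioned on the text prompt.
rng = random.Random(0)
data = [0.5, -1.0, 2.0]  # stand-in for an image or latent vector
noise = [rng.gauss(0.0, 1.0) for _ in data]
oracle = lambda x, t: target_velocity(noise, data)

loss = flow_matching_loss(oracle, noise, data, t=0.3)
sample = euler_sample(oracle, noise, num_steps=10)
```

With the oracle predictor the loss is zero and the Euler integration lands on the data point. A trained model would only approximate this, and the number of sampling steps would then trade speed against quality.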
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Artificial Intelligence Applications · Advanced Text Analysis Techniques
