Class-Conditional self-reward mechanism for improved Text-to-Image models

Safouane El Ghazouali; Arnaud Gucciardi; Umberto Michelucci

arXiv:2405.13473·cs.CV·May 28, 2024

TL;DR

This paper introduces a class-conditional self-reward mechanism for Text-to-Image models, enhancing image quality, automation, and prompt adherence by fine-tuning diffusion models with self-generated data and auxiliary pre-trained models.

Contribution

It presents a novel self-rewarding approach for Text-to-Image models, improving quality and automation through fine-tuning with self-judged data conditioned on object sets.

Findings

01 Performance improved by at least 60% over existing models.

02 Automated image generation with higher visual quality.

03 Enhanced adherence to prompt instructions.

Abstract

Self-rewarding models have recently emerged as a powerful tool in the field of Natural Language Processing (NLP), allowing language models to generate high-quality, relevant responses by providing their own rewards during training. This technique addresses the limitations of other methods that rely on human preferences. In this paper, we build upon the concept of self-rewarding models and introduce its vision equivalent for Text-to-Image generative AI models. This approach works by fine-tuning a diffusion model on a self-generated, self-judged dataset, making the fine-tuning more automated and improving data quality. The proposed mechanism makes use of other pre-trained models, such as vocabulary-based object detection and image captioning, and is conditioned on a set of objects for which the user might need to improve generated data quality. The approach has been implemented,…
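The self-generated, self-judged dataset described in the abstract can be pictured with a minimal sketch. Everything named here is an assumption for illustration, not the authors' implementation: `self_judge`, `Candidate`, the `generate` and `detect` callables (stand-ins for a diffusion model and a pre-trained object detector), and the scoring rule (fraction of requested target classes that the detector finds) are all hypothetical. The captioning model mentioned in the abstract is left out because the excerpt does not say how it is used.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Set


@dataclass
class Candidate:
    prompt: str
    image: object   # whatever the (hypothetical) generator returns
    score: float    # assumed reward: share of wanted classes detected


def self_judge(
    prompts: Sequence[str],
    target_classes: Set[str],
    generate: Callable[[str, int], List[object]],
    detect: Callable[[object], Set[str]],
    n_candidates: int = 4,
) -> List[Candidate]:
    """Build a fine-tuning set by keeping the best self-generated image per prompt."""
    dataset: List[Candidate] = []
    for prompt in prompts:
        # Class conditioning (assumed): only prompts that mention a target class count.
        # Naive whitespace tokenization; punctuation handling is omitted for brevity.
        wanted = target_classes & set(prompt.lower().split())
        if not wanted:
            continue
        best = None
        for image in generate(prompt, n_candidates):
            found = detect(image)
            score = len(wanted & found) / len(wanted)
            if best is None or score > best.score:
                best = Candidate(prompt, image, score)
        # Drop prompts for which no candidate shows any requested object.
        if best is not None and best.score > 0:
            dataset.append(best)
    return dataset
```

The returned prompt and image pairs would then serve as the fine-tuning data; the fine-tuning step itself is outside this sketch.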

Code & Models

Repositories

safouaneelg/SRT2I
PyTorch · Official

Taxonomy

Topics: Brain Tumor Detection and Classification · Image Retrieval and Classification Techniques · Advanced Text Analysis Techniques

Methods: Sparse Evolutionary Training · Diffusion