Towards Equitable Representation in Text-to-Image Synthesis Models with the Cross-Cultural Understanding Benchmark (CCUB) Dataset

Zhixuan Liu; Youeun Shin; Beverley-Claire Okogwu; Youngsik Yun; Lia Coleman; Peter Schaldenbrand; Jihie Kim; Jean Oh

arXiv:2301.12073·cs.CV·April 27, 2023·6 cites

TL;DR

This paper introduces a culturally-aware fine-tuning approach for text-to-image models using the CCUB dataset, improving cultural relevance and reducing offensiveness in generated images through combined visual and semantic priming.

Contribution

It presents a novel method combining visual and semantic priming with a curated cultural dataset to enhance representation in text-to-image synthesis.

Findings

01 Improved cultural relevance in generated images.
02 Decreased offensiveness of outputs.
03 Maintained image quality after fine-tuning.

Abstract

It has been shown that accurate representation in media improves the well-being of the people who consume it. By contrast, inaccurate representations can negatively affect viewers and lead to harmful perceptions of other cultures. To achieve inclusive representation in generated images, we propose a culturally-aware priming approach for text-to-image synthesis using a small but culturally curated dataset that we collected, known here as the Cross-Cultural Understanding Benchmark (CCUB) Dataset, to fight the bias prevalent in giant datasets. Our proposed approach comprises two fine-tuning techniques: (1) Adding visual context via fine-tuning a pre-trained text-to-image synthesis model, Stable Diffusion, on the CCUB text-image pairs, and (2) Adding semantic context via automated prompt engineering using the fine-tuned large language model, GPT-3, trained on our CCUB culturally-aware…
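The semantic-context step in (2) can be pictured with a small Python sketch. This is not the authors' implementation: the abstract describes a fine-tuned GPT-3 doing the prompt engineering, whereas the sketch below swaps in a plain word-overlap lookup over captions as a stand-in. The names `CaptionedImage`, `select_context`, and `prime_prompt`, the output format, and the example records are all hypothetical and are not taken from CCUB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionedImage:
    """Hypothetical CCUB-style record: an image path, its caption, its culture."""
    image_path: str
    caption: str
    culture: str


def select_context(prompt, culture, records, k=2):
    """Return up to k captions from `culture` that share the most words with `prompt`.

    Assumption: word overlap stands in for the fine-tuned language model
    that the abstract describes.
    """
    prompt_words = set(prompt.lower().split())
    candidates = [r for r in records if r.culture == culture]
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(
        candidates,
        key=lambda r: len(prompt_words & set(r.caption.lower().split())),
        reverse=True,
    )
    return [r.caption for r in ranked[:k]]


def prime_prompt(prompt, culture, records, k=2):
    """Append culture-specific caption context to a text-to-image prompt."""
    context = select_context(prompt, culture, records, k)
    if not context:
        # No captions for this culture: leave the prompt unchanged.
        return prompt
    joined = "; ".join(context)
    return f"{prompt}, {culture}. Context: {joined}"


# Made-up records for illustration only; not taken from the CCUB dataset.
demo_records = [
    CaptionedImage("a.jpg", "a bride wearing a traditional hanbok", "Korea"),
    CaptionedImage("b.jpg", "street food stall at night", "Korea"),
    CaptionedImage("c.jpg", "a bride in a white dress", "Other"),
]

if __name__ == "__main__":
    print(prime_prompt("a bride at a wedding", "Korea", demo_records, k=1))
```

The primed string would then be passed to the text-to-image model in place of the raw prompt. Under this sketch's assumptions, the visual-context step in (1) is untouched; it concerns the image model's weights rather than the prompt.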

Code & Models

Repositories

cmubig/ccub (Official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computational and Text Analysis Methods

Methods: Attention Is All You Need · Linear Layer · Dropout · Softmax · Cosine Annealing · Attention Dropout · Linear Warmup With Cosine Annealing · Byte Pair Encoding