Multilingual Conceptual Coverage in Text-to-Image Models
Michael Saxon, William Yang Wang

TL;DR
This paper introduces CoCo-CroLa, a benchmarking technique that measures how consistently a text-to-image model depicts the same tangible nouns when prompted in different languages.
Contribution
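The paper presents a method for assessing multilingual parity in text-to-image models. For each tangible noun, it compares the images generated from the source-language word with the images generated from its translation in a target language, so conceptual coverage can be evaluated without a-priori assumptions.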
Findings
CoCo-CroLa effectively benchmarks multilinguality in T2I models.
The technique reveals model-specific weaknesses, spurious correlations, and biases.
Despite its simplicity, it serves as a good proxy for generalization in multilingual settings.
Abstract
We propose "Conceptual Coverage Across Languages" (CoCo-CroLa), a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns. For each model we can assess "conceptual coverage" of a given target language relative to a source language by comparing the population of images generated for a series of tangible nouns in the source language to the population of images generated for each noun under translation in the target language. This technique allows us to estimate how well-suited a model is to a target language as well as identify model-specific weaknesses, spurious correlations, and biases without a-priori assumptions. We demonstrate how it can be used to benchmark T2I models in terms of multilinguality, and how despite its simplicity it is a good proxy for impressive generalization.
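The population comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each generated image has already been mapped to a fixed-length embedding by some image encoder, and it scores a concept by the mean cosine similarity between source-language and target-language image embeddings. The function names (`cross_similarity`, `coverage_scores`), the nested-dictionary layout, and the choice of cosine similarity are all hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row of x to unit length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cross_similarity(source_emb, target_emb):
    """Mean cosine similarity over all (source image, target image) pairs.

    Both arguments are (n_images, dim) arrays of image embeddings.
    The similarity measure is an assumption made for this sketch.
    """
    s = l2_normalize(np.asarray(source_emb, dtype=float))
    t = l2_normalize(np.asarray(target_emb, dtype=float))
    return float((s @ t.T).mean())


def coverage_scores(embeddings, source_lang, target_lang):
    """Score every concept for one target language.

    embeddings: dict mapping concept -> dict mapping language code ->
    (n_images, dim) array. This layout is hypothetical.
    """
    return {
        concept: cross_similarity(by_lang[source_lang], by_lang[target_lang])
        for concept, by_lang in embeddings.items()
    }


def synthetic_population(center, n_images, rng, noise=0.05):
    """Stand-in for real image embeddings: noisy copies of one center vector."""
    return center + noise * rng.standard_normal((n_images, center.size))


# Toy data: "dog" looks the same in both languages, "chair" does not.
rng = np.random.default_rng(0)
axis_a, axis_b, axis_c = np.eye(8)[:3]
toy_embeddings = {
    "dog": {
        "en": synthetic_population(axis_a, 10, rng),
        "es": synthetic_population(axis_a, 10, rng),
    },
    "chair": {
        "en": synthetic_population(axis_b, 10, rng),
        "es": synthetic_population(axis_c, 10, rng),
    },
}
toy_scores = coverage_scores(toy_embeddings, source_lang="en", target_lang="es")
```

In this toy setup a concept whose target-language images resemble its source-language images receives a score near 1, while a concept the model fails to render in the target language scores near 0. Averaging the per-concept scores would give one rough per-language summary, again under the assumptions above.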
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
