On Fairness of Unified Multimodal Large Language Model for Image Generation
Ming Liu, Hao Chen, Jindong Wang, Liwen Wang, Bhiksha Raj Ramakrishnan, Wensheng Zhang

TL;DR
This paper investigates demographic biases in unified multimodal large language models for image generation, identifies bias sources, and proposes a balanced preference model to mitigate bias while maintaining output quality.
Contribution
The paper benchmarks U-MLLMs for demographic bias, introduces a locate-then-fix debiasing strategy that audits individual model components, and proposes a balanced preference model to reduce that bias.
Findings
Most U-MLLMs exhibit significant gender and race bias.
Bias mainly originates from the language model component.
The proposed balanced preference model reduces bias while preserving semantic fidelity.
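As a rough illustration of what "exhibiting bias" can mean quantitatively, the sketch below scores how far a set of generated images departs from an even split across demographic groups. It is not the metric used in the paper: the function name `bias_score`, the L2-distance-from-uniform formulation, and the example labels are all assumptions made for illustration. The labels stand in for the output of some attribute classifier applied to images generated from a neutral prompt.

```python
from collections import Counter
from math import sqrt


def bias_score(labels, groups):
    """L2 distance between observed group frequencies and a uniform split.

    Hypothetical helper, not the paper's metric.
    labels: predicted group label for each generated image.
    groups: all groups considered, including ones that never appear.
    Returns 0.0 for a perfectly even split; larger values mean more skew.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    n = len(labels)
    uniform = 1.0 / len(groups)
    return sqrt(sum((counts.get(g, 0) / n - uniform) ** 2 for g in groups))


# Made-up labels for 100 images generated from one neutral prompt.
groups = ["female", "male"]
balanced = ["female", "male"] * 50
skewed = ["male"] * 90 + ["female"] * 10

print(bias_score(balanced, groups))  # even split
print(bias_score(skewed, groups))    # 90/10 split
```

A score like this only captures distributional skew for one attribute at a time; it says nothing about image quality or semantic fidelity, which would need to be measured separately.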
Abstract
Unified multimodal large language models (U-MLLMs) have demonstrated impressive performance in visual understanding and generation in an end-to-end pipeline. Compared with generation-only models (e.g., Stable Diffusion), U-MLLMs may raise new questions about bias in their outputs, which can be affected by their unified capabilities. This gap is particularly concerning given the under-explored risk of propagating harmful stereotypes. In this paper, we benchmark the latest U-MLLMs and find that most exhibit significant demographic biases, such as gender and race bias. To better understand and mitigate this issue, we propose a locate-then-fix strategy, where we audit the model and show how individual model components are affected by bias. Our analysis shows that bias originates primarily from the language model. More interestingly, we observe a "partial alignment" phenomenon in U-MLLMs, where…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications · AI in cancer detection
