Runtime Monitoring and Enforcement of Conditional Fairness in Generative AIs
Chih-Hong Cheng, Changshun Wu, Xingyu Zhao, Saddek Bensalem, Harald Ruess

TL;DR
This paper introduces methods for runtime monitoring and enforcement of conditional fairness in generative AI, addressing fairness concerns specific to broad, context-dependent outputs and proposing techniques to ensure fairness thresholds are maintained.
Contribution
It presents novel characterization and enforcement techniques for fairness in GenAI, including bounding worst-case unfairness and a prompt injection scheme for enforcement.
Findings
Effective enforcement of fairness thresholds in GenAI models
Validation of methods on state-of-the-art systems
Introduction of combinatorial testing for intersectional fairness
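The listing does not say which form of combinatorial testing is used. As a minimal sketch, assuming plain 2-way (pairwise) coverage over demographic attributes, the check below reports which attribute-value pairs have not yet appeared together in any generated output. The function name `uncovered_pairs` and the attribute names in the usage are hypothetical and not taken from the paper.

```python
from itertools import combinations, product


def uncovered_pairs(attributes, observed):
    """Return the attribute-value pairs that never co-occur in `observed`.

    attributes: dict mapping attribute name -> list of possible values.
    observed:   list of dicts, one per generated output, mapping
                attribute name -> detected value (hypothetical labels).
    """
    # Every pair of values drawn from two different attributes must be covered.
    required = set()
    for a, b in combinations(sorted(attributes), 2):
        for va, vb in product(attributes[a], attributes[b]):
            required.add(((a, va), (b, vb)))

    # Collect the pairs that actually appear together in some output.
    seen = set()
    for out in observed:
        for a, b in combinations(sorted(out), 2):
            seen.add(((a, out[a]), (b, out[b])))

    return required - seen


attributes = {"gender": ["f", "m"], "age": ["young", "old"]}
observed = [
    {"gender": "f", "age": "young"},
    {"gender": "m", "age": "old"},
]
missing = uncovered_pairs(attributes, observed)
```

Here `missing` holds the two intersectional combinations that no output has shown yet (old with f, young with m). Pairwise coverage keeps the number of required combinations manageable as attributes are added, which is the usual motivation for combinatorial testing.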
Abstract
The deployment of generative AI (GenAI) models raises significant fairness concerns, addressed in this paper through novel characterization and enforcement techniques specific to GenAI. Unlike standard AI performing specific tasks, GenAI's broad functionality requires "conditional fairness" tailored to the context being generated, such as demographic fairness in generating images of poor people versus successful business leaders. We define two fairness levels: the first evaluates fairness in generated outputs, independent of prompts and models; the second assesses inherent fairness with neutral prompts. Given the complexity of GenAI and challenges in fairness specifications, we focus on bounding the worst case, considering a GenAI system unfair if the distance between appearances of a specific group exceeds preset thresholds. We also explore combinatorial testing for assessing…
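The worst-case bound in the abstract can be illustrated with a small runtime monitor. This is a sketch under stated assumptions, not the authors' implementation: it reads "distance between appearances" as the number of generated outputs since a group last appeared, and it assumes the caller passes in only outputs that match the condition of interest (for example, images of business leaders). The class name `GapMonitor` and the group labels are hypothetical.

```python
from typing import Dict, Iterable, List


class GapMonitor:
    """Tracks, per group, how many outputs have passed since it last appeared."""

    def __init__(self, groups: Iterable[str], threshold: int) -> None:
        self.threshold = threshold
        # Assumption: every group starts with a gap of zero.
        self.since_last: Dict[str, int] = {g: 0 for g in groups}

    def observe(self, group: str) -> List[str]:
        """Record one output showing `group`; return groups over the threshold."""
        if group not in self.since_last:
            raise ValueError(f"unknown group: {group!r}")
        # One more output has been generated, so every gap grows by one...
        for g in self.since_last:
            self.since_last[g] += 1
        # ...except for the group that just appeared, whose gap resets.
        self.since_last[group] = 0
        return [g for g, gap in self.since_last.items() if gap > self.threshold]


monitor = GapMonitor(["a", "b"], threshold=2)
alerts = [monitor.observe(g) for g in ["a", "a", "a", "b"]]
```

In this run the third consecutive output showing group "a" pushes the gap for "b" to 3, above the threshold of 2, so the monitor flags "b"; the flag clears once "b" appears. An enforcement step could use such a flag to adjust the next prompt, though how the paper's prompt injection scheme does this is not described in this listing.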
Taxonomy
Topics: Ethics and Social Impacts of AI
Methods: Focus
