Seeking Human Security Consensus: A Unified Value Scale for Generative AI Value Safety
Ying He, Baiyang Li, Yule Cao, Huirun Xu, Qiuxian Chen, Shu Chen, Shangsheng Ren

TL;DR
This paper introduces a comprehensive, internationally inclusive framework for assessing and improving value safety in generative AI, highlighting current disparities and proposing a pathway toward shared safety standards.
Contribution
It presents the GVS-Scale, a unified value safety framework for generative AI, comprising a risk taxonomy, an incident repository, and an evaluation benchmark, to promote global safety consensus.
Findings
Significant variation in value safety performance across models
Current systems show uneven and fragmented value alignment
Shared safety foundations are essential for progress
Abstract
The rapid development of generative AI has brought value- and ethics-related risks to the forefront, making value safety a critical concern, yet a unified consensus is still lacking. In this work, we propose an internationally inclusive and resilient unified value framework, the GenAI Value Safety Scale (GVS-Scale). Grounded in a lifecycle-oriented perspective, we develop a taxonomy of GenAI value safety risks and construct the GenAI Value Safety Incident Repository (GVSIR); we then derive the GVS-Scale through grounded theory and operationalize it via the GenAI Value Safety Benchmark (GVS-Bench). Experiments on mainstream text generation models reveal substantial variation in value safety performance across models and value categories, indicating uneven and fragmented value alignment in current systems. Our findings highlight the importance of establishing shared safety foundations…
Taxonomy
Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
