Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability
Haonan Li, Xudong Han, Zenan Zhai, Honglin Mu, Hao Wang, Zhenxuan Zhang, Yilin Geng, Shom Lin, Renxi Wang, Artem Shelmanov, Xiangyu Qi, Yuxia Wang, Donghai Hong, Youliang Yuan, Meng Chen, Haoqin Tu, Fajri Koto, Tatsuki Kuribayashi, Cong Zeng, Rishabh Bhardwaj, Bingchen Zhao

TL;DR
Libra-Leaderboard introduces a balanced evaluation framework for LLMs that jointly assesses performance and safety, promoting models that optimize both aspects rather than excelling in one at the expense of the other.
Contribution
It presents a novel balanced ranking method using a distance-to-optimal-score approach and a dynamic leaderboard to encourage responsible AI development.
Findings
An evaluation of 26 mainstream LLMs from 14 leading organizations reveals critical safety challenges, even in state-of-the-art models.
The balanced ranking incentivizes models to improve both safety and capability.
The framework promotes responsible AI through joint optimization.
Abstract
To address this gap, we introduce Libra-Leaderboard, a comprehensive framework designed to rank LLMs through a balanced evaluation of performance and safety. Combining a dynamic leaderboard with an interactive LLM arena, Libra-Leaderboard encourages the joint optimization of capability and safety. Unlike traditional approaches that average performance and safety metrics, Libra-Leaderboard uses a distance-to-optimal-score method to calculate the overall rankings. This approach incentivizes models to achieve a balance rather than excelling in one dimension at the expense of the other. In the first release, Libra-Leaderboard evaluates 26 mainstream LLMs from 14 leading organizations, identifying critical safety challenges even in state-of-the-art models.
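The contrast between averaging and a distance-to-optimal score can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below assumes one plausible form: performance and safety are each scaled to [0, 1], the optimal point is (1, 1), and the overall score is one minus the Euclidean distance to that point, divided by sqrt(2) so the result stays in [0, 1]. The function names and the two example models are hypothetical.

```python
import math


def balanced_score(performance: float, safety: float) -> float:
    """Assumed distance-to-optimal score: 1 minus the normalized
    Euclidean distance from (performance, safety) to the ideal (1, 1)."""
    for name, value in (("performance", performance), ("safety", safety)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    distance = math.hypot(1.0 - performance, 1.0 - safety)
    # sqrt(2) is the largest possible distance, reached at (0, 0).
    return 1.0 - distance / math.sqrt(2.0)


def average_score(performance: float, safety: float) -> float:
    """Plain arithmetic mean, the baseline the abstract contrasts against."""
    return (performance + safety) / 2.0


# Hypothetical models with the same mean but different balance.
models = {
    "lopsided": (0.9, 0.5),  # strong capability, weak safety
    "balanced": (0.7, 0.7),  # equal on both axes
}

for model_name, (perf, safe) in models.items():
    print(
        f"{model_name}: average={average_score(perf, safe):.3f} "
        f"balanced={balanced_score(perf, safe):.3f}"
    )
```

Under this assumed formula, both hypothetical models average 0.70, but the lopsided one drops to roughly 0.64 while the balanced one stays at 0.70. Squaring the shortfall on each axis penalizes a large deficit in one dimension more than two moderate deficits, which is the incentive the abstract describes.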
Taxonomy
Topics: Ethics and Social Impacts of AI
