A Comparative Analysis of Ethical and Safety Gaps in LLMs using Relative Danger Coefficient

Yehor Tereshchenko; Mika Hämäläinen

arXiv:2505.04654·cs.CL·May 9, 2025

Open Access

TL;DR

This paper introduces the Relative Danger Coefficient metric to compare ethical and safety risks across various large language models, emphasizing the importance of human oversight in high-stakes scenarios.

Contribution

It presents a novel metric, the Relative Danger Coefficient, for assessing harm in LLMs and provides a comparative analysis of multiple models' ethical performance.

Findings

01 DeepSeek-V3(R1) shows improved reasoning capabilities.

02 GPT variants exhibit varying safety profiles.

03 The RDC metric effectively quantifies potential harm.

Abstract

Artificial Intelligence (AI) and Large Language Models (LLMs) have rapidly evolved in recent years, showcasing remarkable capabilities in natural language understanding and generation. However, these advancements also raise critical ethical questions regarding safety, potential misuse, discrimination, and overall societal impact. This article provides a comparative analysis of the ethical performance of various AI models, including the brand-new DeepSeek-V3 (R1, with and without reasoning), various GPT variants (4o, 3.5 Turbo, 4 Turbo, o1/o3 mini), and Gemini (1.5 Flash, 2.0 Flash, and 2.0 Flash Exp), and highlights the need for robust human oversight, especially in high-stakes situations. Furthermore, we present a new metric for calculating harm in LLMs called the Relative Danger Coefficient (RDC).

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI

Methods: Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · Linear Layer · Multi-Head Attention · Dense Connections · Discriminative Fine-Tuning · Adam