An Evaluation of GPT-4 on the ETHICS Dataset
Sergey Rodionov, Zarathustra Amadeus Goertzel, Ben Goertzel

TL;DR
This study evaluates GPT-4's performance on the ETHICS dataset, showing it surpasses previous models and indicating that aligning AI with shared human values may be less challenging than expected.
Contribution
The paper provides an empirical assessment of GPT-4's ability to understand and judge ethical scenarios across multiple ethical domains.
Findings
GPT-4 outperforms previous models on the ETHICS dataset
High agreement with human moral judgments
Learning human values may be less difficult for AI than expected
Abstract
This report summarizes a short study of the performance of GPT-4 on the ETHICS dataset. The ETHICS dataset consists of five sub-datasets covering different fields of ethics: Justice, Deontology, Virtue Ethics, Utilitarianism, and Commonsense Ethics. The moral judgments were curated to have a high degree of agreement, with the aim of representing shared human values rather than moral dilemmas. GPT-4's performance is much better than that of previous models, which suggests that learning to work with common human values is not the hard problem for AI ethics.
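As a rough illustration of how results over the five sub-datasets could be tallied, the sketch below computes per-subset accuracy from (subset, gold label, predicted label) triples. It is not the authors' evaluation code: the function name `accuracy_by_subset`, the record layout, and the toy records are all hypothetical, and it assumes every example reduces to a single categorical label, which may not match how each ETHICS sub-dataset is actually scored.

```python
from collections import defaultdict


def accuracy_by_subset(records):
    """Return {subset: accuracy} for an iterable of (subset, gold, pred) triples.

    Hypothetical helper: assumes each example carries one gold label and one
    predicted label that can be compared for equality.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subset, gold, pred in records:
        total[subset] += 1
        correct[subset] += int(gold == pred)
    return {subset: correct[subset] / total[subset] for subset in total}


# Invented toy records for illustration only; these are not ETHICS data
# and not results from the study.
toy_records = [
    ("justice", 1, 1),
    ("justice", 0, 1),
    ("commonsense", 0, 0),
    ("virtue", 1, 1),
]

scores = accuracy_by_subset(toy_records)
for subset, acc in scores.items():
    print(f"{subset}: {acc:.2f}")
```

Reporting one number per subset rather than a single pooled accuracy keeps a large sub-dataset from masking weaker performance on a smaller one.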
Taxonomy
Topics: Ethics and Social Impacts of AI · Psychology of Moral and Emotional Judgment · Explainable Artificial Intelligence (XAI)
Methods: Attention Is All You Need · Softmax · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection · Adam · Linear Layer · Multi-Head Attention · Dropout
