Ethics2vec: aligning automatic agents and human preferences

Gianluca Bontempi

arXiv:2508.07673·cs.AI·August 12, 2025

Open Access

TL;DR

Ethics2Vec introduces a method that represents AI decision-making strategies as vector embeddings so they can be compared with human ethical values, addressing the challenge of aligning AI systems with complex, incommensurable human values.

Contribution

The paper extends the Anything2Vec approach to ethics, proposing a vectorization method for automatic agents to assess alignment with human values.

Findings

01. Proposes Ethics2Vec for ethical alignment of AI agents.

02. Demonstrates vectorization of binary decision-making.

03. Extends approach to automatic control systems.

Abstract

Though intelligent agents are supposed to improve human experience (or make it more efficient), it is hard from a human perspective to grasp the ethical values that are explicitly or implicitly embedded in an agent's behaviour. This is the well-known problem of alignment, which refers to the challenge of designing AI systems that align with human values, goals and preferences. This problem is particularly challenging since most human ethical considerations refer to incommensurable (i.e., non-measurable and/or incomparable) values and criteria. Consider, for instance, a medical agent prescribing a treatment to a cancer patient. How could it take into account (and/or weigh) incommensurable aspects like the value of a human life and the cost of the treatment? Now, the alignment between human and artificial values is possible only if we define a common space where a metric can be…

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Logic, Reasoning, and Knowledge · Reinforcement Learning in Robotics