Measuring an artificial intelligence agent's trust in humans using machine incentives

Tim Johnson; Nick Obradovich

arXiv:2212.13371 · cs.AI · December 29, 2022 · 5 cites

Open Access

TL;DR

This study introduces a method for measuring an AI agent's trust in humans by incentivizing its decisions without altering its underlying algorithms. It finds that the agent trusts humans more often when real incentives are involved, independent of stakes or uncertainty.

Contribution

The paper presents a novel incentive-based approach to assess AI trust in humans using large language models, validated through multiple trust game experiments.

Findings

01. AI trusts humans more with real incentives than hypothetical ones

02. Trust decisions are unaffected by the magnitude of stakes

03. AI prefers certain options over uncertain ones in non-social tasks

Abstract

Scientists and philosophers have debated whether humans can trust advanced artificial intelligence (AI) agents to respect humanity's best interests. Yet what about the reverse? Will advanced AI agents trust humans? Gauging an AI agent's trust in humans is challenging because, absent costs for dishonesty, such agents might respond falsely about their trust in humans. Here we present a method for incentivizing machine decisions without altering an AI agent's underlying algorithms or goal orientation. In two separate experiments, we then employ this method in hundreds of trust games between an AI agent (a Large Language Model (LLM) from OpenAI) and a human experimenter (author TJ). In our first experiment, we find that the AI agent decides to trust humans at higher rates when facing actual incentives than when making hypothetical decisions. Our second experiment replicates and extends…
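
For readers unfamiliar with trust games, the sketch below shows the payoff arithmetic of a generic two-player trust game as it is commonly described in experimental economics. It is not the paper's protocol: this summary does not state the endowment, the multiplier, or how the machine incentive was delivered, so the function name `trust_game_payoffs` and every parameter value here are illustrative assumptions.

```python
def trust_game_payoffs(endowment, sent, multiplier, returned_fraction):
    """Payoffs in a generic trust game (illustrative, not the paper's design).

    The trustor (here, the role an AI agent would play) starts with
    `endowment` and sends `sent` to the trustee. The amount sent is
    multiplied by `multiplier` in transit, and the trustee returns
    `returned_fraction` of what arrives.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    if not 0 <= returned_fraction <= 1:
        raise ValueError("returned_fraction must be between 0 and 1")

    received = sent * multiplier              # amount reaching the trustee
    returned = received * returned_fraction   # amount sent back to the trustor
    trustor_payoff = endowment - sent + returned
    trustee_payoff = received - returned
    return trustor_payoff, trustee_payoff


# Hypothetical values: full trust, tripled transfer, trustee returns half.
trusting = trust_game_payoffs(endowment=10, sent=10, multiplier=3, returned_fraction=0.5)
# Hypothetical values: no trust, so the trustor simply keeps the endowment.
not_trusting = trust_game_payoffs(endowment=10, sent=0, multiplier=3, returned_fraction=0.5)
```

With these assumed numbers, trusting leaves the trustor better off only if the trustee returns enough of the multiplied transfer, which is why the decision to send reflects trust.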

Taxonomy

Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)