Towards interactive evaluations for interaction harms in human-AI systems

Lujain Ibrahim; Saffron Huang; Umang Bhatt; Lama Ahmad; Markus Anderljung

arXiv:2405.10632·cs.CY·July 31, 2025·2 cites

TL;DR

This paper advocates for a new evaluation paradigm for human-AI systems that emphasizes interactional ethics and the assessment of long-term interaction harms, moving beyond static, model-only testing methods.

Contribution

It introduces principles for designing interactive evaluations that account for social and ethical impacts over time, addressing limitations of current static evaluation approaches.

Findings

01. Current evaluation methods are static and limited in scope.

02. The paper proposes practical principles for interactive evaluation design.

03. The paper highlights challenges and open questions for implementing interactive assessments.

Abstract

Current AI evaluation methods, which rely on static, model-only tests, fail to account for harms that emerge through sustained human-AI interaction. As AI systems proliferate and are increasingly integrated into real-world applications, this disconnect between evaluation approaches and actual usage becomes more significant. In this paper, we propose a shift towards evaluation based on interactional ethics, which focuses on interaction harms: issues like inappropriate parasocial relationships, social manipulation, and cognitive overreliance that develop over time through repeated interaction, rather than through isolated outputs. First, we discuss the limitations of current evaluation methods, which (1) are static, (2) assume a universal user experience, and (3) have limited construct validity. Drawing on research from human-computer interaction, natural language…

Taxonomy

Topics: Ethics and Social Impacts of AI

MethodsFocus