AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations
Bhada Yun, Renn Su, April Yi Wang

TL;DR
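This paper introduces VAPT, the Value-Alignment Perception Toolkit, for studying how well LLMs extract, embody, and explain people's values from casual conversations and how people judge those reflections. It also warns about "weaponized empathy" in value-aware yet welfare-misaligned conversational agents.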
Contribution
It presents VAPT as a novel method for assessing AI's ability to extract, embody, and explain human values, along with design implications for responsible AI development.
Findings
13 of 20 participants left the study convinced that AI can understand human values
The authors warn about "weaponized empathy", a design pattern that may arise in interactions with value-aware yet welfare-misaligned conversational agents
VAPT provides a new framework for evaluating value-alignment in AI systems
Abstract
Does AI understand human values? While this remains an open philosophical question, we take a pragmatic stance by introducing VAPT, the Value-Alignment Perception Toolkit, for studying how LLMs reflect people's values and how people judge those reflections. 20 participants texted a chatbot over a month, then completed a 2-hour interview with our toolkit evaluating AI's ability to extract (pull details regarding), embody (make decisions guided by), and explain (provide proof of) their values. 13 participants ultimately left our study convinced that AI can understand human values. Thus, we warn about "weaponized empathy": a design pattern that may arise in interactions with value-aware, yet welfare-misaligned conversational agents. VAPT offers a new way to evaluate value-alignment in AI systems. We also offer design implications to evaluate and responsibly build AI systems with…
