ValueFlow: Measuring the Propagation of Value Perturbations in Multi-Agent LLM Systems
Jinnuo Liu, Chuke Liu, Hua Shen

TL;DR
ValueFlow is a framework that measures how value perturbations propagate in multi-agent LLM systems, revealing the influence of agent interactions and system structure on value drift.
Contribution
It introduces a 56-value evaluation dataset derived from the Schwartz Value Survey and two metrics, beta-susceptibility and system susceptibility (SS), for analyzing value drift in multi-agent LLM systems.
Findings
Susceptibility varies across different values.
Structural topology significantly influences value propagation.
Agents show diverse responses to value perturbations.
Abstract
Multi-agent large language model (LLM) systems increasingly consist of agents that observe and respond to one another's outputs. While value alignment is typically evaluated for isolated models, how value perturbations propagate through agent interactions remains poorly understood. We present ValueFlow, a perturbation-based evaluation framework for measuring and analyzing value drift in multi-agent systems. ValueFlow introduces a 56-value evaluation dataset derived from the Schwartz Value Survey and quantifies agents' value orientations during interaction using an LLM-as-a-judge protocol. Building on this measurement layer, ValueFlow decomposes value drift into agent-level response behavior and system-level structural effects, operationalized by two metrics: beta-susceptibility, which measures an agent's sensitivity to perturbed peer signals, and system susceptibility (SS), which…
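The abstract describes beta-susceptibility only as an agent's sensitivity to perturbed peer signals; the formal definition is not included in this excerpt. As a rough illustration of what such a sensitivity measure could look like, the sketch below fits a least-squares slope of an agent's change in judged value score against the size of the perturbation in its peers' signals. The function name `susceptibility_slope`, the variables `peer_shift` and `agent_shift`, and the toy numbers are all hypothetical, and the slope is an assumed proxy rather than the paper's metric.

```python
from statistics import mean


def susceptibility_slope(peer_shift, agent_shift):
    """Least-squares slope of agent_shift regressed on peer_shift.

    Hypothetical proxy for an agent's sensitivity to perturbed peer
    signals; NOT the definition used in the ValueFlow paper.
    """
    if len(peer_shift) != len(agent_shift) or len(peer_shift) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mx, my = mean(peer_shift), mean(agent_shift)
    # Sum of squared deviations of the perturbation sizes.
    sxx = sum((x - mx) ** 2 for x in peer_shift)
    if sxx == 0:
        raise ValueError("peer_shift must vary to estimate a slope")
    # Cross-deviation between perturbation size and agent response.
    sxy = sum((x - mx) * (y - my) for x, y in zip(peer_shift, agent_shift))
    return sxy / sxx


# Toy data (made up): the agent moves half as far as its peers were perturbed.
peer_shift = [0.0, 1.0, 2.0, 3.0]
agent_shift = [0.0, 0.5, 1.0, 1.5]
beta = susceptibility_slope(peer_shift, agent_shift)
```

Under this assumed reading, a slope near 0 would indicate an agent that ignores perturbed peers, and a slope near 1 an agent that tracks them closely.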
Taxonomy
Topics: Topic Modeling · Language and cultural evolution · Natural Language Processing Techniques
