Value Alignment Tax: Measuring Value Trade-offs in LLM Alignment

Jiajun Chen; Hua Shen

arXiv:2602.12134 · cs.AI · April 28, 2026

TL;DR

This paper introduces VAT, a framework to measure how alignment interventions in LLMs cause shifts and trade-offs among interconnected values, revealing unintended effects often hidden in traditional evaluations.

Contribution

VAT provides a systematic way to quantify value trade-offs and system-level dynamics of value expression in LLMs under various alignment interventions.

Findings

01 Alignment causes uneven co-movement among values.

02 Traditional evaluations miss many unintended value shifts.

03 VAT reveals systematic trade-offs between target and non-target values.

Abstract

Existing work on value alignment typically characterizes value relations statically, ignoring how alignment interventions, such as prompting, fine-tuning, or preference optimization, reshape the broader value system. In practice, aligning a target value can implicitly shift other values, creating value trade-offs that remain largely unmeasured. We introduce VAT, a framework that quantifies value trade-offs by measuring how alignment-induced changes propagate across interconnected values relative to achieved on-target gain. VAT captures the system-level dynamics of value expression under alignment intervention, enabling evaluation of both intended improvements and unintended side effects. Using a controlled scenario-action dataset grounded in Schwartz value theory, we collect paired pre-post normative judgments and analyze alignment effects across models, values, and interventions.…
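The abstract describes VAT only at a high level, so the following is a minimal sketch of the general idea of comparing off-target shifts to on-target gain, not the paper's actual definition. The function names `value_shifts` and `off_target_ratio`, the per-value scores, and the choice of a mean absolute shift as the numerator are all assumptions made for illustration.

```python
from typing import Dict


def value_shifts(pre: Dict[str, float], post: Dict[str, float]) -> Dict[str, float]:
    """Per-value change (post minus pre) for values scored in both conditions."""
    return {name: post[name] - pre[name] for name in pre if name in post}


def off_target_ratio(pre: Dict[str, float], post: Dict[str, float], target: str) -> float:
    """Mean absolute shift of non-target values divided by the on-target gain.

    Hypothetical summary statistic, assumed here for illustration only;
    the paper's VAT metric may be defined differently.
    """
    shifts = value_shifts(pre, post)
    gain = shifts[target]
    if gain == 0:
        raise ValueError("on-target gain is zero; the ratio is undefined")
    off_target = [abs(delta) for name, delta in shifts.items() if name != target]
    if not off_target:
        return 0.0
    return (sum(off_target) / len(off_target)) / abs(gain)


# Made-up scores for three Schwartz values, before and after an intervention
# that targets "benevolence".
pre = {"benevolence": 0.5, "power": 0.5, "security": 0.5}
post = {"benevolence": 0.75, "power": 0.25, "security": 0.625}

ratio = off_target_ratio(pre, post, target="benevolence")
print(ratio)  # 0.75
```

In this toy case the target value rises by 0.25 while the two non-target values move by 0.25 and 0.125 in absolute terms, so the off-target movement is 0.75 of the on-target gain. A ratio near zero would indicate an intervention that leaves other values largely untouched.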

Code & Models

Datasets

Tinyhope/Value_Alignment_Tax
dataset · 53 downloads
