VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

Jiawei Chen; Tianzhuo Yang; Guoxi Zhang; Jiaming Ji; Yaodong Yang; Juntao Dai

arXiv:2603.04822·cs.AI·March 6, 2026

TL;DR

VISA introduces a novel framework for fine-grained, precise value alignment in LLMs that mitigates the alignment tax and preserves semantic integrity, outperforming traditional fine-tuning and prompting methods.

Contribution

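The paper presents VISA, a closed-loop framework comprising a high-precision value detector, a semantic-to-value translator, and a core value-rewriter. The value-rewriter is trained via Group Relative Policy Optimization (GRPO) to balance value precision against semantic preservation.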

Findings

01. VISA achieves better value alignment with less semantic drift.
02. It outperforms standard fine-tuning and prompting baselines.
03. The approach maintains factual consistency and general capabilities.

Abstract

Aligning Large Language Models (LLMs) with nuanced human values remains a critical challenge, as existing methods like Reinforcement Learning from Human Feedback (RLHF) often handle only coarse-grained attributes. In practice, fine-tuning LLMs on task-specific datasets to optimize value alignment inevitably incurs an alignment tax: the model's pre-calibrated value system drifts significantly due to latent bias absorption from training data, while the fine-tuning process also causes severe hallucinations and semantic information loss in generated responses. To address this, we propose VISA (Value Injection via Shielded Adaptation), a closed-loop framework designed to navigate this trade-off. VISA's architecture features a high-precision value detector, a semantic-to-value translator, and a core value-rewriter. The value-rewriter is trained via Group Relative Policy Optimization (GRPO)…
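
The abstract names GRPO as the training method for the value-rewriter but is cut off before describing the reward. As a minimal sketch of the general GRPO idea only, the snippet below normalizes each sampled response's reward against the mean and standard deviation of its own sampling group, which removes the need for a learned value baseline. The function names, the `weight` parameter, and the weighted-sum reward that mixes a value score with a semantic score are all hypothetical illustrations and are not taken from the paper.

```python
from statistics import mean, pstdev


def combined_reward(value_score, semantic_score, weight=0.5):
    """Hypothetical reward: a weighted mix of value match and semantic fidelity.

    This is an illustrative assumption, not the reward used by VISA.
    """
    return weight * value_score + (1.0 - weight) * semantic_score


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantage: (r_i - group mean) / group std.

    `rewards` holds the scores of several responses sampled for one prompt.
    The population standard deviation is used here; `eps` avoids division
    by zero when every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Three hypothetical rewrites of one response, scored as (value, semantic).
    scores = [(0.9, 0.4), (0.6, 0.8), (0.2, 0.9)]
    rewards = [combined_reward(v, s) for v, s in scores]
    for reward, advantage in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={reward:.2f} advantage={advantage:+.2f}")
```

Responses that score above their group average receive a positive advantage and are reinforced, while those below it are discouraged. How VISA actually weighs value precision against semantic preservation is not stated in the truncated abstract.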

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications