SPRI: Aligning Large Language Models with Context-Situated Principles

Hongli Zhan; Muneeza Azmat; Raya Horesh; Junyi Jessy Li; Mikhail Yurochkin

arXiv:2502.03397·cs.CL·May 30, 2025

TL;DR

SPRI is a framework that automatically generates real-time, context-specific guiding principles for large language models, improving alignment, performance, and truthfulness with minimal human effort.

Contribution

The paper introduces SPRI, a method that automatically generates instance-specific principles to improve LLM alignment and performance.

Findings

1. SPRI performs on par with expert-crafted principles on a complex, domain-specific task.

2. SPRI-generated principles outperform prior LLM-as-a-judge frameworks.

3. Using SPRI to generate synthetic training data significantly improves truthfulness.

Abstract

Aligning Large Language Models to integrate and reflect human values, especially for tasks that demand intricate human oversight, is arduous since it is resource-intensive and time-consuming to depend on human expertise for context-specific guidance. Prior work has utilized predefined sets of rules or principles to steer the behavior of models (Bai et al., 2022; Sun et al., 2023). However, these principles tend to be generic, making it challenging to adapt them to each individual input query or context. In this work, we present Situated-PRInciples (SPRI), a framework requiring minimal or no human effort that is designed to automatically generate guiding principles in real-time for each input query and utilize them to align each response. We evaluate SPRI on three tasks, and show that 1) SPRI can derive principles in a complex domain-specific task that leads to on-par performance as…
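The two steps the abstract names (generate principles for one specific query, then use them to guide the response) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the prompt wording, and the `stub_llm` stand-in are all hypothetical, and any further stages of the actual framework are not shown.

```python
from typing import Callable, Tuple

# Hypothetical type: any function that maps a prompt string to a completion.
LLM = Callable[[str], str]


def generate_principles(llm: LLM, query: str) -> str:
    """Ask the model for guiding principles tailored to this one query."""
    prompt = (
        "Write short guiding principles for answering the query below.\n"
        f"Query: {query}\n"
        "Principles:"
    )
    return llm(prompt)


def respond_with_principles(llm: LLM, query: str, principles: str) -> str:
    """Answer the query with the generated principles placed in the prompt."""
    prompt = (
        "Follow these principles when answering.\n"
        f"Principles: {principles}\n"
        f"Query: {query}\n"
        "Answer:"
    )
    return llm(prompt)


def situated_response(llm: LLM, query: str) -> Tuple[str, str]:
    """Run both steps and return (principles, response)."""
    principles = generate_principles(llm, query)
    return principles, respond_with_principles(llm, query, principles)


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model so the sketch runs offline."""
    if prompt.rstrip().endswith("Principles:"):
        return "Be accurate; acknowledge uncertainty."
    return "Stub answer guided by the supplied principles."


if __name__ == "__main__":
    principles, answer = situated_response(stub_llm, "Is this claim true?")
    print(principles)
    print(answer)
```

The point of the sketch is only that the principles are produced per query rather than drawn from a fixed, generic list; swapping `stub_llm` for a real model client is left to the reader.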

Videos

SPRI: Aligning Large Language Models with Context-Situated Principles · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods