Value Alignment from Unstructured Text

Inkit Padhi; Karthikeyan Natesan Ramamurthy; Prasanna Sattigeri; Manish Nagireddy; Pierre Dognin; Kush R. Varshney

arXiv:2408.10392·cs.CL·August 21, 2024

Open Access

TL;DR

This paper presents a scalable end-to-end method for aligning large language models to values expressed in unstructured text, reducing reliance on costly annotated data and demonstrating improved alignment performance.

Contribution

The paper introduces a novel methodology that uses synthetic data generation to align LLMs to implicit and explicit values in unstructured text, validated on the Mistral-7B-Instruct model.

Findings

01 Effective alignment to values in unstructured data

02 Improved performance over existing approaches

03 Validated on Mistral-7B-Instruct model

Abstract

Aligning large language models (LLMs) to value systems has emerged as a significant area of research within the fields of AI and NLP. Currently, this alignment process relies on the availability of high-quality supervised and preference data, which can be both time-consuming and expensive to curate or annotate. In this paper, we introduce a systematic end-to-end methodology for aligning LLMs to the implicit and explicit values represented in unstructured text data. Our proposed approach leverages the use of scalable synthetic data generation techniques to effectively align the model to the values present in the unstructured data. Through two distinct use-cases, we demonstrate the efficiency of our methodology on the Mistral-7B-Instruct model. Our approach credibly aligns LLMs to the values embedded within documents, and shows improved performance against other approaches, as quantified…

Taxonomy

Topics: Natural Language Processing Techniques

Methods: ALIGN