Pluralistic Behavior Suite: Stress-Testing Multi-Turn Adherence to Custom Behavioral Policies

Prasoon Varshney; Makesh Narsimhan Sreedhar; Liwei Jiang; Traian Rebedea; Christopher Parisien

arXiv:2511.05018·cs.CL·November 10, 2025

TL;DR

This paper introduces PBSUITE, a comprehensive evaluation framework and dataset for testing LLMs' adherence to diverse, real-world behavioral policies in multi-turn conversations, revealing significant compliance challenges under adversarial conditions.

Contribution

We present PBSUITE, a novel dynamic evaluation suite with a large dataset and stress-testing framework for assessing pluralistic alignment of LLMs in complex, multi-turn interactions.

Findings

01. Models adhere well in single-turn settings (<4% failure).

02. Compliance drops significantly in multi-turn adversarial interactions (up to 84% failure).

03. Existing alignment methods are insufficient for real-world, pluralistic scenarios.
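
To make the failure percentages above concrete, here is a minimal sketch of how such a rate could be computed. It assumes, hypothetically, that each conversation has already been judged turn by turn against its policy and that a conversation counts as a failure if any assistant turn is flagged. The `Conversation` record, the `failure_rate` helper, and the toy data are illustrative assumptions, not the paper's implementation or results.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Conversation:
    # Hypothetical record: one conversation judged against one policy.
    policy_id: str
    # One entry per assistant turn; True means the turn was flagged as a violation.
    turn_violations: List[bool]


def failure_rate(conversations: List[Conversation]) -> float:
    """Fraction of conversations with at least one flagged assistant turn."""
    if not conversations:
        raise ValueError("need at least one conversation")
    failed = sum(1 for c in conversations if any(c.turn_violations))
    return failed / len(conversations)


# Toy, made-up judgments for illustration only (not data from the paper).
single_turn = [
    Conversation("policy-a", [False]),
    Conversation("policy-b", [False]),
    Conversation("policy-c", [False]),
    Conversation("policy-d", [True]),
]
multi_turn = [
    Conversation("policy-a", [False, False, True]),
    Conversation("policy-b", [False, True, True]),
    Conversation("policy-c", [False, False, False]),
    Conversation("policy-d", [True, False, False]),
]

print(f"single-turn failure rate: {failure_rate(single_turn):.0%}")
print(f"multi-turn failure rate: {failure_rate(multi_turn):.0%}")
```

Under this assumed definition, a longer conversation gives an adversarial user more turns in which a single lapse is enough to count as a failure, which is one plausible reason multi-turn rates can far exceed single-turn rates.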

Abstract

Large language models (LLMs) are typically aligned to a universal set of safety and usage principles intended for broad public acceptability. Yet, real-world applications of LLMs often take place within organizational ecosystems shaped by distinctive corporate policies, regulatory requirements, use cases, brand guidelines, and ethical commitments. This reality highlights the need for rigorous and comprehensive evaluation of LLMs with pluralistic alignment goals, an alignment paradigm that emphasizes adaptability to diverse user values and needs. In this work, we present PLURALISTIC BEHAVIOR SUITE (PBSUITE), a dynamic evaluation suite designed to systematically assess LLMs' capacity to adhere to pluralistic alignment specifications in multi-turn, interactive conversations. PBSUITE consists of (1) a diverse dataset of 300 realistic LLM behavioral policies, grounded in 30 industries; and…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Hate Speech and Cyberbullying Detection