CHOIR: Collaborative Harmonization fOr Inference Robustness
Xiangjue Dong, Cong Wang, Maria Teleki, Millennium Bismay, James Caverlee

TL;DR
CHOIR is a test-time framework that harmonizes multiple persona-conditioned reasoning signals in LLMs, improving inference robustness across demographics without additional training.
Contribution
The paper introduces CHOIR, a method that treats persona variations as constructive signals for improving the reasoning robustness of LLMs at test time.
Findings
Up to 26.4% performance improvement for individual demographics.
Average 19.2% performance boost across five demographics.
Effective across various models, scales, and tasks without extra training.
Abstract
Persona-assigned Large Language Models (LLMs) can adopt diverse roles, enabling personalized and context-aware reasoning. However, even minor demographic perturbations in personas, such as simple pronoun changes, can alter reasoning trajectories, leading to divergent sets of correct answers. Instead of treating these variations as biases to be mitigated, we explore their potential as a constructive resource to improve reasoning robustness. We propose CHOIR (Collaborative Harmonization fOr Inference Robustness), a test-time framework that harmonizes multiple persona-conditioned reasoning signals into a unified prediction. CHOIR orchestrates a collaborative decoding process among counterfactual personas, dynamically balancing agreement and divergence in their reasoning paths. Experiments on various reasoning benchmarks demonstrate that CHOIR consistently enhances performance across…
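The abstract describes harmonizing several persona-conditioned reasoning signals into one prediction, but the text available here does not give the actual combination rule. The sketch below is only an illustration of the general idea under a stated assumption: each persona contributes a next-token distribution, and personas that diverge more from the group average (measured by KL divergence) receive less weight. The function names `harmonize`, `softmax`, and `kl`, the `temperature` parameter, and the weighting rule itself are hypothetical and should not be read as the authors' method.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def harmonize(persona_logits, temperature=1.0):
    """Merge per-persona next-token logits into a single distribution.

    Assumption (not from the paper): each persona is weighted by
    softmax(-KL(p_i || mean) / temperature), so personas closer to the
    group consensus count more.
    """
    probs = [softmax(l) for l in persona_logits]
    n, vocab = len(probs), len(probs[0])
    mean = [sum(p[j] for p in probs) / n for j in range(vocab)]
    weights = softmax([-kl(p, mean) / temperature for p in probs])
    return [sum(w * p[j] for w, p in zip(weights, probs)) for j in range(vocab)]

# Toy example: three counterfactual personas over a 3-token vocabulary.
# Two personas favor token 0; one favors token 1.
persona_logits = [
    [2.0, 0.5, 0.1],
    [1.8, 0.7, 0.0],
    [0.2, 2.5, 0.1],
]
combined = harmonize(persona_logits)
best_token = max(range(len(combined)), key=combined.__getitem__)
```

In this toy setup the divergent persona still contributes probability mass, yet the harmonized distribution follows the two personas that agree. A lower `temperature` would penalize divergence more sharply; a very high one approaches a plain average.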
