Post-training makes large language models less human-like

Marcel Binz; Elif Akata; Abdullah Almaatouq; Mohammed Alsobay; Oleksii Ariasov; Franziska Brändle; David Broska; Jason W. Burton; Nuno Busch; Frederick Callaway; Vanessa Cheung; Brian Christian; Julian Coda-Forno; Can Demircan; Vittoria Dentella; Maria K. Eckstein; Noémi Éltető; Michael Franke; Thomas L. Griffiths; Fritz Günther; Susanne Haridi; Sebastian Hellmann; Stefan Herytash; Linus Hof; Eleanor Holton; Isabelle Hoxha; Zak Hussain; Akshay Jagadish; Elif Kara; Valentin Kriegmair; Evelina Leivada; Li Ji-An; Tobias Ludwig; Maximilian Maier; Marcelo G. Mattar; Marvin Mathony; Alireza Modirshanechi; Robin Na; Mariia Nadverniuk; Antonios Nasioulas; Surabhi S. Nath; Helen Niemeyer; Kate Nussenbaum; Sebastian Olschewski; Thorsten Pachur; Stefano Palminteri; Aliona Petrenco; Camille V. Phaneuf-Hadd; Angelo Pirrone; Manuel Rausch; Laura Raveling; Shashank Reddy; Milena Rmus; Evan M. Russek; Tankred Saanum; Kai Sandbrink; Louis Schiekiera; Johannes A. Schubert; Luca M. Schulze Buschoff; Nishad Singhi; Leah H. Somerville; Mikhail S. Spektor; Xin Sui; Christopher Summerfield; Mirko Thalmann; Anna I. Thoma; Taisiia Tikhomirova; Vuong Truong; Polina Tsvilodub; Konstantinos Voudouris; Robert C. Wilson; Kristin Witte; Shuchen Wu; Dirk U. Wulff; Hua-Dong Xiong; Songlin Xu; Lance Ying; Xinyu Zhang; Jian-Qiao Zhu; and Eric Schulz

arXiv:2605.07632·cs.CL·May 11, 2026

TL;DR

Post-training reduces how closely large language models match human behavior, and persona-induction does not improve predictions for individual participants, suggesting a trade-off between assistant utility and human-likeness.

Contribution

The study introduces Psych-201, a new dataset for measuring behavioral alignment at scale, and shows that post-training reduces human-likeness across model families, sizes, and objectives, while persona-induction fails to improve predictions at the individual level.

Findings

01

Post-training reduces alignment with human behavior across models.

02

The misalignment introduced by post-training widens in newer model generations, even as base models continue to improve.

03

Persona-induction does not enhance individual-level human-like predictions.

Abstract

Large language models (LLMs) are increasingly used as surrogates for human participants, but it remains unclear which models best capture human behavior and why. To address this, we introduce Psych-201, a novel dataset that enables us to measure behavioral alignment at scale. We find that post-training -- the stage that turns base models into useful assistants -- consistently reduces alignment with human behavior across model families, sizes, and objectives. Moreover, this misalignment widens in newer model generations even as base models continue to improve. Finally, we find that persona-induction -- a popular technique for eliciting human-like behavior by conditioning models on participant-specific information -- does not improve predictions at the level of individuals. Taken together, our results suggest that the very processes that are currently employed to turn LLMs into useful…
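The abstract does not say how behavioral alignment is scored. As a rough illustration only, one common way to score how well a model predicts human behavior is the average negative log-likelihood (NLL) it assigns to the choices people actually made, where lower values mean a closer match. The sketch below assumes that metric; the function name `mean_nll`, the two-option task, and all probabilities are hypothetical toy values, not data or methods from the paper.

```python
import math


def mean_nll(choice_probs, human_choices, eps=1e-12):
    """Average negative log-likelihood of observed human choices.

    choice_probs: one dict per trial, mapping each option to the
        probability the model assigns to it (hypothetical format).
    human_choices: the option a participant actually chose on each trial.
    Lower values mean the model's predictions sit closer to the choices.
    """
    if not human_choices:
        raise ValueError("need at least one trial")
    if len(choice_probs) != len(human_choices):
        raise ValueError("one probability dict is required per trial")
    total = 0.0
    for probs, choice in zip(choice_probs, human_choices):
        # Clamp to eps so an unlisted or zero-probability option
        # does not produce log(0).
        p = max(probs.get(choice, 0.0), eps)
        total -= math.log(p)
    return total / len(human_choices)


# Toy, made-up numbers for two hypothetical models on four trials.
human_choices = ["A", "B", "A", "A"]
probs_a = [  # graded predictions
    {"A": 0.7, "B": 0.3},
    {"A": 0.4, "B": 0.6},
    {"A": 0.6, "B": 0.4},
    {"A": 0.8, "B": 0.2},
]
probs_b = [  # near-deterministic predictions that miss trial 2
    {"A": 0.99, "B": 0.01},
    {"A": 0.95, "B": 0.05},
    {"A": 0.99, "B": 0.01},
    {"A": 0.99, "B": 0.01},
]

nll_a = mean_nll(probs_a, human_choices)
nll_b = mean_nll(probs_b, human_choices)
print(f"model A mean NLL: {nll_a:.3f}")
print(f"model B mean NLL: {nll_b:.3f}")
```

Under this assumed metric, a model that is confidently wrong on even a few trials can score worse than one with more graded predictions, which is one reason likelihood-based scores are sensitive to overly deterministic behavior.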
