Do You Trust Me? Cognitive-Affective Signatures of Trustworthiness in Large Language Models

Gerard Yeo; Svetlana Churina; Kokil Jaidka

arXiv:2601.10719 · cs.AI · January 19, 2026

Open Access

TL;DR

This study investigates how large language models encode perceived trustworthiness, revealing that they implicitly internalize trust signals related to fairness, certainty, and accountability without explicit supervision.

Contribution

The paper demonstrates that instruction-tuned LLMs internally encode psychologically grounded trust signals, providing a foundation for developing more credible and transparent AI systems.

Findings

01. Trust cues are implicitly encoded during pretraining.

02. Linearly decodable trust signals are present in model activations.

03. Fine-tuning refines trust representations without restructuring them.

Abstract

Perceived trustworthiness underpins how users navigate online information, yet it remains unclear whether large language models (LLMs), increasingly embedded in search, recommendation, and conversational systems, represent this construct in psychologically coherent ways. We analyze how instruction-tuned LLMs (Llama 3.1 8B, Qwen 2.5 7B, Mistral 7B) encode perceived trustworthiness in web-like narratives using the PEACE-Reviews dataset annotated for cognitive appraisals, emotions, and behavioral intentions. Across models, systematic layer- and head-level activation differences distinguish high- from low-trust texts, revealing that trust cues are implicitly encoded during pretraining. Probing analyses show linearly decodable trust signals and fine-tuning effects that refine rather than restructure these representations. Strongest associations emerge with appraisals of fairness, certainty,…
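To make "linearly decodable" concrete, the following is a minimal sketch of a linear probe, not the authors' pipeline. It assumes synthetic stand-ins for hidden-state vectors (Gaussian noise shifted along one fixed direction for "high-trust" items) rather than real activations from Llama, Qwen, or Mistral, and it trains a plain logistic-regression probe by gradient descent. All function names, dimensions, and hyperparameters are hypothetical and chosen only for illustration. A shuffled-label control is included because a probe's accuracy is only informative relative to what it achieves when the labels carry no signal.

```python
import math
import random


def make_synthetic_activations(n_per_class=200, dim=16, shift=2.0, seed=0):
    """Toy stand-in for hidden states (an assumption, not real model data).

    Each vector is standard Gaussian noise moved by +shift (label 1) or
    -shift (label 0) along one fixed unit direction.
    """
    rng = random.Random(seed)
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in direction))
    direction = [d / norm for d in direction]
    X, y = [], []
    for label in (0, 1):
        sign = 1.0 if label == 1 else -1.0
        for _ in range(n_per_class):
            X.append([rng.gauss(0.0, 1.0) + sign * shift * d for d in direction])
            y.append(label)
    return X, y


def train_test_split(X, y, test_fraction=0.2, seed=1):
    """Shuffle indices and hold out a fraction of examples for evaluation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1.0 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(X, y, lr=0.5, epochs=100, l2=1e-3):
    """Fit logistic regression (weights w, bias b) by full-batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - target
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * (g / n + l2 * wi) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of examples whose predicted label (score > 0) matches y."""
    hits = 0
    for x, target in zip(X, y):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        hits += int(pred == target)
    return hits / len(X)


def run_probe_demo():
    """Return (probe accuracy, shuffled-label control accuracy) on held-out data."""
    X, y = make_synthetic_activations()
    X_tr, y_tr, X_te, y_te = train_test_split(X, y)
    w, b = train_linear_probe(X_tr, y_tr)
    probe_acc = accuracy(w, b, X_te, y_te)

    # Control: same vectors, but labels randomly permuted so no signal remains.
    y_shuffled = list(y)
    random.Random(2).shuffle(y_shuffled)
    X_tr, y_tr, X_te, y_te = train_test_split(X, y_shuffled)
    w_c, b_c = train_linear_probe(X_tr, y_tr)
    control_acc = accuracy(w_c, b_c, X_te, y_te)
    return probe_acc, control_acc


if __name__ == "__main__":
    probe_acc, control_acc = run_probe_demo()
    print(f"probe accuracy:   {probe_acc:.3f}")
    print(f"control accuracy: {control_acc:.3f}")
```

In a real analysis, the synthetic vectors would be replaced by per-layer activations extracted from the model for each text, and the probe would be fit separately per layer so that the depth at which the signal becomes decodable can be compared. A gap between probe and control accuracy is what would support a claim of linear decodability; the toy data here is constructed so that such a gap exists.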

Taxonomy

Topics: AI in Service Interactions · Ethics and Social Impacts of AI · Artificial Intelligence in Healthcare and Education