Aletheia: Quantifying Cognitive Conviction in Reasoning Models via Regularized Inverse Confusion Matrix

Fanzhe Fu

arXiv:2601.01532 · cs.AI · January 6, 2026

TL;DR

This paper introduces Project Aletheia, a framework that quantifies the depth of belief in reasoning models by inverting a judge's confusion matrix with Tikhonov regularization. The method is validated on synthetic data and piloted on 2025 baseline models such as DeepSeek-R1 and OpenAI o1.

Contribution

The paper extends the CHOKE framework (Simhi et al., 2025) from standard QA to measure 'Cognitive Conviction' in reasoning models, and proposes a methodology that uses Tikhonov Regularization to invert the judge's confusion matrix.

Findings

01

Reasoning models appear to act as a 'cognitive buffer'.

02

Preliminary results suggest that reasoning models may exhibit 'Defensive OverThinking' under adversarial pressure.

03

The Aligned Conviction Score helps ensure safety without compromising conviction.

Abstract

In the progressive journey toward Artificial General Intelligence (AGI), current evaluation paradigms face an epistemological crisis. Static benchmarks measure knowledge breadth but fail to quantify the depth of belief. While Simhi et al. (2025) defined the CHOKE phenomenon in standard QA, we extend this framework to quantify "Cognitive Conviction" in System 2 reasoning models. We propose Project Aletheia, a cognitive physics framework that employs Tikhonov Regularization to invert the judge's confusion matrix. To validate this methodology without relying on opaque private data, we implement a Synthetic Proxy Protocol. Our preliminary pilot study on 2025 baselines (e.g., DeepSeek-R1, OpenAI o1) suggests that while reasoning models act as a "cognitive buffer," they may exhibit "Defensive OverThinking" under adversarial pressure. Furthermore, we introduce the Aligned Conviction Score…
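The abstract describes inverting the judge's confusion matrix with Tikhonov Regularization but the listing gives no formula. The following is a minimal sketch of that general technique, not the paper's implementation. It assumes a row-stochastic confusion matrix with entries P(judge says j | true label i), and it recovers an estimate of the true label distribution from the distribution of the judge's verdicts. The function name `tikhonov_invert`, the parameter `lam`, and the final clip-and-renormalize step are illustrative assumptions.

```python
import numpy as np

def tikhonov_invert(confusion, observed, lam=1e-2):
    """Estimate the true label distribution behind a noisy judge.

    confusion[i, j] is assumed to be P(judge says j | true label i).
    observed[j] is the fraction of items the judge labelled j.
    lam is the Tikhonov (ridge) penalty; larger values trade bias for stability.
    """
    # A maps a true distribution p to the expected observed distribution: q = A @ p.
    A = np.asarray(confusion, dtype=float).T
    q = np.asarray(observed, dtype=float)
    n = A.shape[1]
    # Closed-form minimizer of ||A p - q||^2 + lam * ||p||^2.
    p = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ q)
    # Assumption: project back to a probability vector by clipping and renormalizing.
    p = np.clip(p, 0.0, None)
    total = p.sum()
    if total == 0.0:
        raise ValueError("estimate collapsed to zero; try a different lam")
    return p / total

# Hypothetical two-class judge: correct 90% of the time on class 0, 80% on class 1.
confusion = np.array([[0.9, 0.1],
                      [0.2, 0.8]])
true_dist = np.array([0.7, 0.3])
observed = confusion.T @ true_dist  # what the judge would report on average
estimate = tikhonov_invert(confusion, observed, lam=1e-6)
```

With a well-conditioned matrix and a very small `lam`, `estimate` stays close to `true_dist`. The penalty matters more when the judge's confusion matrix is nearly singular, where an unregularized inverse would amplify sampling noise in `observed`.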

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning