CBMAS: Cognitive Behavioral Modeling via Activation Steering

Ahmed H. Ismail; Anthony Kuang; Ayo Akinkugbe; Kevin Zhu; Sean O'Brien

arXiv:2601.06109·cs.AI·January 13, 2026

TL;DR

CBMAS introduces a diagnostic framework that uses continuous activation steering to analyze and interpret cognitive behaviors in large language models, revealing tipping points and behavioral trajectories.

Contribution

The paper presents a novel continuous diagnostic approach combining activation steering, bias curves, and sensitivity analysis to understand LLM cognitive behaviors.

Findings

1. Reveals tipping points where small interventions flip model behavior
2. Shows how steering effects evolve across model layers
3. Provides tools and datasets for cognitive behavior analysis
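The second finding, how a steering effect changes with the layer at which it is injected, can be illustrated with a small toy. The sketch below is not the CBMAS implementation: the residual stack, the helper names `run_stack` and `layer_sensitivity`, the random weights, and the readout direction are all assumptions made for illustration. It injects `alpha * v` before one layer at a time and reports the resulting change in a scalar readout per unit of `alpha`.

```python
import numpy as np

def run_stack(h0, layers, v=None, alpha=0.0, inject_at=None):
    """Run a toy residual stack h <- h + tanh(W h), optionally adding
    alpha * v to the hidden state just before layer `inject_at`."""
    h = h0.copy()
    for i, W in enumerate(layers):
        if i == inject_at:
            h = h + alpha * v
        h = h + np.tanh(W @ h)
    return h

def layer_sensitivity(h0, layers, v, readout, alpha=0.5):
    """Finite-difference change in the readout per unit alpha,
    computed separately for each injection layer."""
    base = readout @ run_stack(h0, layers)
    out = []
    for layer in range(len(layers)):
        steered = readout @ run_stack(h0, layers, v, alpha, inject_at=layer)
        out.append((steered - base) / alpha)
    return np.array(out)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_model, n_layers = 8, 6
    # Hypothetical stand-ins: random layer weights, steering direction, readout.
    layers = [0.3 * rng.normal(size=(d_model, d_model)) for _ in range(n_layers)]
    h0 = rng.normal(size=d_model)
    v = rng.normal(size=d_model)
    v /= np.linalg.norm(v)
    readout = rng.normal(size=d_model)
    for layer, s in enumerate(layer_sensitivity(h0, layers, v, readout)):
        print(f"inject before layer {layer}: sensitivity {s:+.3f}")
```

In a real model the readout would typically be a logit difference between two answer tokens, and the injection sites would be actual residual-stream or attention/MLP hooks rather than this toy recurrence.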

Abstract

Large language models (LLMs) often encode cognitive behaviors unpredictably across prompts, layers, and contexts, making them difficult to diagnose and control. We present CBMAS, a diagnostic framework for continuous activation steering, which extends cognitive bias analysis from discrete before/after interventions to interpretable trajectories. By combining steering vector construction with dense α-sweeps, logit lens-based bias curves, and layer-site sensitivity analysis, our approach can reveal tipping points where small intervention strengths flip model behavior and show how steering effects evolve across layer depth. We argue that these continuous diagnostics offer a bridge between high-level behavioral evaluation and low-level representational dynamics, contributing to the cognitive interpretability of LLMs. Lastly, we provide a CLI and datasets for various cognitive…
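A minimal sketch of the pipeline the abstract describes may help make it concrete. Everything here is an assumption for illustration: the activations and unembedding matrix are random stand-ins rather than values read from a model, the steering vector is taken as a normalized difference of means over two contrastive prompt sets (one common construction, not necessarily the authors'), and the names `steering_vector`, `bias_curve`, and `tipping_points` are hypothetical and unrelated to the CBMAS CLI. The code sweeps α over a dense grid, reads a logit-lens style logit difference between two tokens at each α, and locates sign flips of that curve.

```python
import numpy as np

def steering_vector(pos, neg):
    """Unit-norm difference of mean activations between two prompt sets."""
    v = pos.mean(axis=0) - neg.mean(axis=0)
    return v / np.linalg.norm(v)

def bias_curve(h, v, alphas, W_U, tok_a, tok_b):
    """Logit difference (tok_a minus tok_b) after adding alpha * v to h,
    evaluated for every alpha by projecting through the unembedding W_U."""
    steered = h[None, :] + alphas[:, None] * v[None, :]
    logits = steered @ W_U
    return logits[:, tok_a] - logits[:, tok_b]

def tipping_points(alphas, curve):
    """Alpha values where the curve changes sign between neighboring grid
    points, estimated by linear interpolation."""
    signs = np.sign(curve)
    idx = np.where(signs[:-1] * signs[1:] < 0)[0]
    a0, a1 = alphas[idx], alphas[idx + 1]
    c0, c1 = curve[idx], curve[idx + 1]
    return a0 - c0 * (a1 - a0) / (c1 - c0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_model, vocab = 16, 50
    # Hypothetical stand-ins for cached activations and the unembedding matrix.
    acts_pos = rng.normal(size=(32, d_model)) + 1.0
    acts_neg = rng.normal(size=(32, d_model)) - 1.0
    W_U = rng.normal(size=(d_model, vocab))
    v = steering_vector(acts_pos, acts_neg)
    h = acts_neg[0]                      # one unsteered hidden state
    alphas = np.linspace(-8.0, 8.0, 161) # dense alpha sweep
    curve = bias_curve(h, v, alphas, W_U, tok_a=3, tok_b=7)
    print("tipping points:", tipping_points(alphas, curve))
```

Because this toy readout is linear in α, the curve crosses zero at most once. In a real model the steered state passes through further nonlinear layers, so the curve can bend or cross more than once, which is the case a dense sweep is meant to expose.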

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare