ActivationReasoning: Logical Reasoning in Latent Activation Spaces

Lukas Helff; Ruben Härle; Wolfgang Stammer; Felix Friedrich; Manuel Brack; Antonia Wüst; Hikaru Shindo; Patrick Schramowski; Kristian Kersting

arXiv:2510.18184·cs.LG·May 12, 2026

TL;DR

ActivationReasoning introduces a framework embedding explicit logical reasoning into LLM latent spaces, enhancing interpretability, control, and reasoning capabilities across diverse tasks.

Contribution

The paper presents a method for incorporating logical reasoning into the latent representations of LLMs, enabling systematic reasoning and model control.

Findings

01 AR scales with reasoning complexity and generalizes well.

02 AR improves transparency and enables structured reasoning.

03 AR transfers effectively across different model backbones.

Abstract

Large language models (LLMs) excel at generating fluent text, but their internal reasoning remains opaque and difficult to control. Sparse autoencoders (SAEs) make hidden activations more interpretable by exposing latent features that often align with human concepts. Yet these features are fragile and passive, offering no mechanism for systematic reasoning or model control. To address this, we introduce ActivationReasoning (AR), a framework that embeds explicit logical reasoning into the latent space of LLMs. It proceeds in three stages: (1) Finding latent representations: latent concept representations are first identified (e.g., via SAEs) and organized into a dictionary; (2) Activating propositions: at inference time, AR detects activating concepts and maps them to logical propositions; and (3) Logical reasoning: logical rules are applied over these propositions to infer higher-order…
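
The three stages described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the concept names, feature indices, activation threshold, rules, and the use of simple forward chaining are all assumptions made for illustration, and a plain list of numbers stands in for real SAE feature activations.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Stage 1 (assumed form): a dictionary mapping concept names to the index of a
# latent feature, e.g. an SAE feature. Names and indices here are hypothetical.
CONCEPT_DICT: Dict[str, int] = {"bridge": 0, "golden_gate": 1, "san_francisco": 2}

# Hypothetical rules of the form (premises, conclusion).
Rule = Tuple[FrozenSet[str], str]
RULES: List[Rule] = [
    (frozenset({"bridge", "san_francisco"}), "golden_gate"),
    (frozenset({"golden_gate"}), "landmark"),
]


def activate_propositions(latents: List[float], threshold: float = 0.5) -> Set[str]:
    """Stage 2 (assumed form): a concept becomes a true proposition when its
    latent feature activation exceeds a threshold."""
    return {name for name, idx in CONCEPT_DICT.items() if latents[idx] > threshold}


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Stage 3 (assumed form): apply rules repeatedly until no new proposition
    can be derived (naive forward chaining)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


# Toy activations standing in for SAE features at inference time.
latents = [0.9, 0.1, 0.8]
facts = activate_propositions(latents)
inferred = forward_chain(facts, RULES)
print(sorted(facts))
print(sorted(inferred))
```

In this toy run, the "golden_gate" feature itself is weak, but the proposition is still derived from the two active concepts, and a second rule then derives "landmark". That is the kind of higher-order inference over detected concepts that the abstract describes, shown here only in a simplified propositional form.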

Code & Models

Datasets

AIML-TUDA/Rail2Country
dataset · 106 dl

Videos

ActivationReasoning: Logical Reasoning in Latent Activation Spaces· slideslive