Internal Flow Signatures for Self-Checking and Refinement in LLMs

Sungheon Jeong; Sanggeon Yun; Ryozo Masukawa; Wenjun Haung; Hanning Chen; Mohsen Imani

arXiv:2602.01897·cs.LG·February 3, 2026

TL;DR

This paper introduces internal flow signatures that enable LLMs to self-check and refine their outputs by analyzing decision dynamics within the model, improving faithfulness without external verification.

Contribution

The paper proposes an internal monitoring method that pairs flow signatures with a lightweight GRU validator, supporting self-checking and targeted refinement in LLMs without modifying the base model.

Findings

1. Effective detection of unfaithful outputs
2. Localization of decision errors within model layers
3. Enabling targeted model refinement

Abstract

Large language models can generate fluent answers that are unfaithful to the provided context, while many safeguards rely on external verification or a separate judge after generation. We introduce internal flow signatures that audit decision formation from depthwise dynamics at a fixed inter-block monitoring boundary. The method stabilizes token-wise motion via bias-centered monitoring, then summarizes trajectories in compact moving readout-aligned subspaces constructed from the top token and its close competitors within each depth window. Neighboring window frames are aligned by an orthogonal transport, yielding depth-comparable transported step lengths, turning angles, and subspace drift summaries that are invariant to within-window basis choices. A lightweight GRU validator trained on these signatures performs self-checking without modifying the base model. Beyond…
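
The geometric quantities named in the abstract (orthogonal transport between neighboring frames, step lengths, turning angles, subspace drift) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract does not specify how the transport is computed, so the sketch assumes an orthogonal Procrustes solution, measures drift by principal angles, and uses hypothetical function names (`orthogonal_transport`, `subspace_drift`, `step_lengths_and_turning_angles`) throughout.

```python
import numpy as np

def orthogonal_transport(B_prev, B_next):
    """Assumed form of the transport: the orthogonal k x k matrix R that
    minimizes ||B_prev @ R - B_next||_F (orthogonal Procrustes).
    B_prev, B_next: (d, k) matrices with orthonormal columns."""
    U, _, Vt = np.linalg.svd(B_prev.T @ B_next)
    return U @ Vt

def subspace_drift(B_prev, B_next):
    """Principal angles (radians) between the two spans. They depend only
    on the subspaces, not on which orthonormal basis represents them."""
    s = np.linalg.svd(B_prev.T @ B_next, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def step_lengths_and_turning_angles(coords):
    """coords: (T, k) trajectory expressed in a common (transported) frame.
    Returns T-1 step lengths and T-2 turning angles between successive steps."""
    steps = np.diff(coords, axis=0)
    lengths = np.linalg.norm(steps, axis=1)
    a, b = steps[:-1], steps[1:]
    # Small epsilon guards against division by zero for stationary steps.
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    cos = np.sum(a * b, axis=1) / denom
    return lengths, np.arccos(np.clip(cos, -1.0, 1.0))

# Toy usage: a right-angle turn in a 2-D frame.
lengths, angles = step_lengths_and_turning_angles(
    np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
)
```

In this toy case both steps have unit length and the single turning angle is a right angle. Re-expressing a frame in a rotated basis of the same subspace changes the transport matrix but leaves the principal angles at zero, which is the kind of basis invariance the abstract describes.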

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Topic Modeling