Interpretable-by-Design Transformers via Architectural Stream Independence

Clayton Kerce; Alexis Fox

arXiv:2603.07482·cs.LG·March 10, 2026

Open Access

TL;DR

This paper proposes the Late Fusion Architecture (LFA), a transformer design that enforces interpretability by keeping the token stream and the semantic stream separate until the output. The authors report that the resulting models are more modular, more stable, and more semantically meaningful than standard transformers.

Contribution

The paper introduces LFA, which enforces interpretability through stream independence, and validates it with new metrics and intervention experiments that show improved modularity and semantic understanding.

Findings

01. LFA maintains interpretable symbolic heads across layers.

02. Interventions on LFA heads cause minimal semantic disruption.

03. LFA achieves higher stability and semantic focus compared to standard transformers.

Abstract

While transformers achieve strong performance, their internal decision-making processes remain opaque. We investigate whether architectural constraints can enforce interpretability by design through architectural stream independence: maintaining a token stream (carrying symbolic structure) and contextual semantics in separated streams that remain independently observable throughout processing, with integration delayed until output. We validate this principle through the Late Fusion Architecture (LFA), which demonstrates interpretable symbolic heads through all the final layers, while standard transformers show dissolution by the third of six layers; we quantify this effect by introducing the Token-Position Dependence Score (PDS), with $\mathrm{PDS}_{\max} = 0.276$ and $0.058$, respectively. Crucially, intervention experiments demonstrate functional modularity: suppressing LFA's recency heads causes…
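
The stream-independence idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the single-head attention, the random weights, the choice to route attention from the token stream, and the additive fusion at the output are all assumptions made for illustration. The only properties taken from the abstract are that the token stream and the contextual semantics are kept in separate, independently observable streams and that they are integrated only at the output.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def causal_attention(routing, values, wq, wk, wv):
    """Single-head causal attention (hypothetical helper).

    Queries and keys are computed from `routing`, values from `values`.
    """
    q, k, v = routing @ wq, routing @ wk, values @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Mask out future positions so position i only sees positions <= i.
    future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v


def late_fusion_forward(token_emb, n_layers, vocab_size, rng):
    """Toy two-stream forward pass with fusion delayed until the output.

    Assumption: layers read the token stream but write only to the
    semantic stream, so the token stream stays observable as-is.
    """
    seq_len, d = token_emb.shape
    token_stream = token_emb                    # never written to
    semantic_stream = np.zeros((seq_len, d))    # accumulates context

    for _ in range(n_layers):
        # Random weights stand in for trained parameters.
        wq, wk, wv = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
        update = causal_attention(
            token_stream, token_stream + semantic_stream, wq, wk, wv
        )
        semantic_stream = semantic_stream + update

    # Late fusion (assumed additive): streams are combined only here.
    w_out = rng.normal(scale=d ** -0.5, size=(d, vocab_size))
    logits = (token_stream + semantic_stream) @ w_out
    return {
        "token_stream": token_stream,
        "semantic_stream": semantic_stream,
        "logits": logits,
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(5, 8))
    out = late_fusion_forward(emb, n_layers=6, vocab_size=11, rng=rng)
    # The token stream is still exactly the input embeddings.
    print(np.array_equal(out["token_stream"], emb))
    print(out["logits"].shape)
```

Because no layer writes to `token_stream` in this sketch, it can be inspected at any depth independently of `semantic_stream`, which is the kind of observability the abstract attributes to stream independence.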

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques