Weights to Code: Extracting Interpretable Algorithms from the Discrete Transformer

Yifan Zhang; Wei Bi; Kechi Zhang; Dongming Jin; Jie Fu; Zhi Jin

arXiv:2601.05770 · cs.LG · March 20, 2026

Open Access

TL;DR

This paper introduces the Discrete Transformer, an architecture that injects discreteness into a Transformer through temperature-annealed sampling so that interpretable, symbolic algorithms can be extracted from the trained model. It achieves performance comparable to RNN-based methods while extending interpretability to continuous variable domains.

Contribution

The paper presents the Discrete Transformer, a new architecture that bridges continuous representations and symbolic logic, facilitating de novo algorithm discovery and interpretability.

Findings

1. Achieves performance comparable to RNN-based methods.

2. Effectively extracts human-readable programs from models.

3. Demonstrates clear exploration-to-exploitation dynamics.

Abstract

Algorithm extraction aims to synthesize executable programs directly from models trained on algorithmic tasks, enabling de novo algorithm discovery without relying on human-written code. However, applying this paradigm to Transformers is hindered by representation entanglement (e.g., superposition), where entangled features encoded in overlapping directions obstruct the recovery of symbolic expressions. We propose the Discrete Transformer, an architecture explicitly designed to bridge the gap between continuous representations and discrete symbolic logic. By injecting discreteness through temperature-annealed sampling, our framework effectively leverages hypothesis testing and symbolic regression to extract human-readable programs. Empirically, the Discrete Transformer achieves performance comparable to RNN-based methods while extending interpretability to continuous variable domains,…
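
The abstract names temperature-annealed sampling but does not spell out the mechanism here. As an illustration only, the sketch below assumes a Gumbel-softmax relaxation with an exponential temperature schedule, which is one common way to push a continuous distribution toward a discrete choice during training. The function names, the schedule, and the start and end temperatures are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Numerically stable softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax_sample(logits, temperature, rng):
    """Relaxed one-hot sample: add Gumbel(0, 1) noise, then apply a tempered softmax.

    Assumption: this relaxation stands in for the paper's sampling step.
    """
    noisy = []
    for x in logits:
        u = max(rng.random(), 1e-12)  # keep log(u) finite
        gumbel = -math.log(-math.log(u))
        noisy.append(x + gumbel)
    return softmax_with_temperature(noisy, temperature)


def annealed_temperature(step, total_steps, t_start=5.0, t_end=0.1):
    """Hypothetical schedule: exponential decay from t_start to t_end."""
    frac = step / max(total_steps - 1, 1)
    return t_start * (t_end / t_start) ** frac


rng = random.Random(0)
logits = [2.0, 1.0, 0.5]
for step in (0, 50, 99):
    t = annealed_temperature(step, 100)
    sample = gumbel_softmax_sample(logits, t, rng)
    print(step, round(t, 3), [round(p, 3) for p in sample])
```

At high temperature the samples stay close to uniform, which corresponds to exploring many candidate choices. As the temperature falls, each sample approaches a one-hot vector, which is the kind of discrete selection a symbolic program can be read from.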

Taxonomy

Topics: Evolutionary Algorithms and Applications · Explainable Artificial Intelligence (XAI) · Software Engineering Research