Mechanistic Interpretability for Neural TSP Solvers

Reuben Narad; Leonard Boussioux; Michael Wagner

arXiv:2510.21693 · cs.LG · October 27, 2025

TL;DR

This paper applies mechanistic interpretability techniques to a Transformer-based neural TSP solver and finds that it develops geometric features, such as boundary detectors and cluster-sensitive responses, that help explain how it constructs tours.

Contribution

The paper presents what the authors describe as the first application of activation-based interpretability methods to operations research models, uncovering geometric features that the network learns without explicit supervision.

Findings

01

Neural TSP solvers develop boundary and cluster features.

02

Geometric structures emerge naturally in the model.

03

The analysis provides insight into the internal representations of neural TSP solvers.

Abstract

Neural networks have advanced combinatorial optimization, with Transformer-based solvers achieving near-optimal solutions on the Traveling Salesman Problem (TSP) in milliseconds. However, these models operate as black boxes, providing no insight into the geometric patterns they learn or the heuristics they employ during tour construction. We address this opacity by applying sparse autoencoders (SAEs), a mechanistic interpretability technique, to a Transformer-based TSP solver, representing the first application of activation-based interpretability methods to operations research models. We train a pointer network with reinforcement learning on 100-node instances, then fit an SAE to the encoder's residual stream to discover an overcomplete dictionary of interpretable features. Our analysis reveals that the solver naturally develops features mirroring fundamental TSP concepts: boundary…
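To make the SAE step in the abstract concrete, here is a minimal sketch of a sparse autoencoder forward pass and loss in NumPy. It is not the authors' implementation: the model width (128), the dictionary size (512, i.e. 4x overcomplete), the L1 coefficient, and the function names `init_sae`, `sae_forward`, and `sae_loss` are all assumptions chosen for illustration, and random vectors stand in for the encoder's residual-stream activations on a 100-node instance.

```python
import numpy as np


def init_sae(d_model, d_dict, seed=0):
    """Create parameters for a one-hidden-layer sparse autoencoder.

    d_dict > d_model gives an overcomplete dictionary.
    """
    rng = np.random.default_rng(seed)
    w_enc = rng.normal(0.0, 0.1, size=(d_model, d_dict))
    w_dec = rng.normal(0.0, 0.1, size=(d_dict, d_model))
    # Give each dictionary direction (decoder row) unit norm.
    w_dec /= np.linalg.norm(w_dec, axis=1, keepdims=True)
    return {
        "w_enc": w_enc,
        "b_enc": np.zeros(d_dict),
        "w_dec": w_dec,
        "b_dec": np.zeros(d_model),
    }


def sae_forward(params, x):
    """Encode activations into non-negative feature codes, then reconstruct."""
    pre = (x - params["b_dec"]) @ params["w_enc"] + params["b_enc"]
    codes = np.maximum(0.0, pre)  # ReLU keeps feature activations >= 0
    recon = codes @ params["w_dec"] + params["b_dec"]
    return codes, recon


def sae_loss(params, x, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparse codes."""
    codes, recon = sae_forward(params, x)
    mse = np.mean(np.sum((recon - x) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(codes), axis=1))
    return mse + l1_coeff * sparsity


# Hypothetical stand-in data: one activation vector per node of a
# 100-node instance, with an assumed model width of 128.
rng = np.random.default_rng(1)
activations = rng.normal(size=(100, 128))

params = init_sae(d_model=128, d_dict=512)
codes, recon = sae_forward(params, activations)
loss = sae_loss(params, activations)
```

In practice the parameters would be trained by gradient descent on this loss over many instances, and each column of `codes` would then be inspected (for example, by plotting which nodes activate it) to see whether it tracks a geometric concept such as boundary membership.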
