What is my math transformer doing? -- Three results on interpretability and generalization

François Charton

arXiv:2211.00170·cs.LG·November 2, 2022

TL;DR

This paper studies the failure cases and out-of-distribution behavior of transformers trained on matrix inversion and eigenvalue decomposition. Incorrect predictions often retain mathematical properties of the solution, and a carefully chosen training dataset lets the model generalize beyond its training distribution.

Contribution

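The paper shows that incorrect predictions of math transformers still preserve mathematical properties of the solution (e.g. correct eigenvalues, unit-norm eigenvectors), and that the choice of training dataset affects both training speed and out-of-distribution generalization.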

Findings

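01. Incorrect predictions often retain mathematical properties of the solution

02. Almost all failures can be attributed to, and predicted from, properties of the problem or solution

03. Careful dataset choice accelerates training and improves out-of-distribution generalization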

Abstract

This paper investigates the failure cases and out-of-distribution behavior of transformers trained on matrix inversion and eigenvalue decomposition. I show that incorrect model predictions still retain deep mathematical properties of the solution (e.g. correct eigenvalues, unit norm of eigenvectors), and that almost all model failures can be attributed to, and predicted from, properties of the problem or solution. This demonstrates that, when in doubt, math transformers do not hallucinate absurd solutions (as was sometimes proposed) but remain "roughly right". I also show that the careful choice of a training dataset can accelerate training, while allowing the model to generalize out of its training distribution, invalidating the idea that transformers "merely interpolate" from memorized examples.
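As an illustration of what "roughly right" could mean for eigenvalue decomposition, the sketch below checks a predicted decomposition of a symmetric matrix against three properties: unit-norm eigenvectors, mutual orthogonality, and reconstruction of the input. It is not taken from the paper or its repository. The function name `check_decomposition`, the 5% tolerance, and the example "failed" prediction are all assumptions made for illustration.

```python
import math


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def check_decomposition(m, eigenvalues, vectors, tol=0.05):
    """Check a predicted decomposition m ~ V diag(eigenvalues) V^T.

    `vectors` holds the predicted eigenvectors as columns.
    `tol` is a hypothetical tolerance, not a value from the paper.
    """
    n = len(m)
    cols = transpose(vectors)

    # Property 1: every predicted eigenvector has (approximately) unit norm.
    norms = [math.sqrt(sum(x * x for x in c)) for c in cols]
    unit_norm = all(abs(nrm - 1.0) <= tol for nrm in norms)

    # Property 2: distinct predicted eigenvectors are (approximately) orthogonal.
    orthogonal = all(
        abs(sum(x * y for x, y in zip(cols[i], cols[j]))) <= tol
        for i in range(n) for j in range(i + 1, n)
    )

    # Property 3: V diag(eigenvalues) V^T reproduces m, relative to its largest entry.
    vd = [[vectors[i][j] * eigenvalues[j] for j in range(n)] for i in range(n)]
    recon = matmul(vd, transpose(vectors))
    err = max(abs(recon[i][j] - m[i][j]) for i in range(n) for j in range(n))
    scale = max(abs(x) for row in m for x in row)
    reconstructs = err <= tol * scale

    return {"unit_norm": unit_norm, "orthogonal": orthogonal,
            "reconstructs": reconstructs}


s = 1.0 / math.sqrt(2.0)
m = [[2.0, 1.0], [1.0, 2.0]]  # symmetric, eigenvalues 3 and 1

# Exact decomposition: all three properties hold.
good = check_decomposition(m, [3.0, 1.0], [[s, s], [s, -s]])

# Hypothetical failed prediction: correct eigenvalues and unit-norm columns,
# but the second eigenvector is wrong, so the result is not a valid decomposition.
bad = check_decomposition(m, [3.0, 1.0], [[s, 0.0], [s, 1.0]])

print(good)
print(bad)
```

The second call shows a prediction that is wrong as a whole yet still passes the unit-norm check, which is the kind of partial correctness the abstract describes.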

Code & Models

Repositories

facebookresearch/lawt (PyTorch)

Taxonomy

Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Machine Learning and Data Classification