Neural Learning of Fast Matrix Multiplication Algorithms: A StrassenNet Approach

Paolo Andreini; Alessandra Bernardi; Monica Bianchini; Barbara Toniella Corradini; Sara Marziali; Giacomo Nunziati; Franco Scarselli

arXiv:2602.21797 · math.AG · February 26, 2026

TL;DR

This paper introduces StrassenNet, a neural architecture that learns fast matrix multiplication algorithms. It numerically recovers Strassen's rank-7 algorithm for 2x2 multiplication and finds evidence of a rank threshold at r = 23 for 3x3 multiplication.

Contribution

The paper presents a neural network approach to discovering and analyzing low-rank decompositions of the matrix multiplication tensor. It recovers Strassen's algorithm for 2x2 multiplication and explores the minimal rank for 3x3 multiplication.

Findings

01

Successfully reproduces Strassen's algorithm for 2x2 multiplication.

02

Identifies a numerical threshold at rank 23 for 3x3 multiplication: models with r = 23 reach significantly lower validation error than those with r ≤ 22.

03

Preliminary results on border-rank decompositions align with known bounds.
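
For readers unfamiliar with border rank, the following standard textbook example (not taken from the paper) may help. A tensor has border rank at most $r$ if it is a limit of tensors of rank at most $r$. Here $x$ and $y$ are assumed to be linearly independent vectors, and $\varepsilon$ plays the role of the parameter in an $\varepsilon$-parametrisation. The tensor $T$ below has rank 3 but border rank 2:

$$
T = x \otimes x \otimes y + x \otimes y \otimes x + y \otimes x \otimes x,
$$

$$
\frac{1}{\varepsilon}\Big[(x + \varepsilon y)^{\otimes 3} - x^{\otimes 3}\Big]
= T + \varepsilon\,(x \otimes y \otimes y + y \otimes x \otimes y + y \otimes y \otimes x) + \varepsilon^{2}\, y^{\otimes 3}
\;\longrightarrow\; T \quad \text{as } \varepsilon \to 0.
$$

The left-hand side is a sum of two rank-one terms for every $\varepsilon \neq 0$, so $T$ is approximated arbitrarily well by rank-2 tensors even though no exact rank-2 decomposition exists.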

Abstract

Fast matrix multiplication can be described as searching for low-rank decompositions of the matrix-multiplication tensor. We design a neural architecture, StrassenNet, which reproduces the Strassen algorithm for $2 \times 2$ multiplication. Across many independent runs the network always converges to a rank-$7$ tensor, thus numerically recovering Strassen's optimal algorithm. We then train the same architecture on $3 \times 3$ multiplication with rank $r \in \{19, \dots, 23\}$. Our experiments reveal a clear numerical threshold: models with $r = 23$ attain significantly lower validation error than those with $r \leq 22$, suggesting that $r = 23$ could actually be the smallest effective rank of the $3 \times 3$ matrix multiplication tensor. We also sketch an extension of the method to border-rank decompositions via an $\varepsilon$-parametrisation and report preliminary results…
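
To make the link between low-rank tensor decompositions and multiplication algorithms concrete, here is a minimal NumPy sketch. It is not the paper's code or the StrassenNet architecture: it only checks that the classical Strassen formulas form an exact rank-7 decomposition of the $2 \times 2$ matrix multiplication tensor. The function names (`matmul_tensor`, `reconstruct`, `bilinear_multiply`), the factor names `U`, `V`, `W`, and the row-major flattening convention are assumptions chosen for illustration.

```python
import numpy as np


def matmul_tensor(n):
    """Matrix multiplication tensor for n x n matrices (row-major flattening).

    T[i, j, k] = 1 exactly when entry i of vec(A) times entry j of vec(B)
    contributes to entry k of vec(A @ B).
    """
    T = np.zeros((n * n, n * n, n * n), dtype=int)
    for a in range(n):
        for b in range(n):
            for c in range(n):
                T[a * n + b, b * n + c, a * n + c] = 1
    return T


# Strassen's seven products, written as factor matrices (illustrative layout).
# Entry order within each row is [11, 12, 21, 22].
# Row r of U and V defines the product M_r = (U[r] . vec(A)) * (V[r] . vec(B));
# row r of W says how M_r is added into the entries of C.
U = np.array([[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
              [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]])
V = np.array([[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
              [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]])
W = np.array([[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
              [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]])


def reconstruct(U, V, W):
    """Sum of rank-one terms U[r] (x) V[r] (x) W[r] over r."""
    return np.einsum("ri,rj,rk->ijk", U, V, W)


def bilinear_multiply(A, B, U, V, W):
    """Multiply square matrices using the bilinear algorithm encoded by (U, V, W)."""
    n = A.shape[0]
    products = (U @ A.reshape(-1)) * (V @ B.reshape(-1))  # one scalar product per rank-one term
    return (W.T @ products).reshape(n, n)


# Largest entrywise gap between the rank-7 sum and the true tensor.
residual = np.abs(reconstruct(U, V, W) - matmul_tensor(2)).max()
```

A learned decomposition can be checked the same way: a candidate `(U, V, W)` with `r` rows is an exact algorithm when `residual` is zero, and an approximate one when the residual is small. The validation error reported in the abstract is presumably a quantity of this kind, although the exact loss used by the authors is not stated here.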

Taxonomy

Topics: Tensor decomposition and applications · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks