A network that learns Strassen multiplication
Veit Elser

TL;DR
This paper trains neural networks whose only non-linear components are multipliers to discover Strassen's matrix multiplication algorithm, showing that low-rank tensor decompositions can be learned by a rule that makes minimal weight adjustments for each example.
Contribution
Applies a conservative learning rule, which makes the smallest weight change that fits each new example, to networks with a limited number of multipliers, so that they are forced to discover Strassen's matrix multiplication rules.
Findings
Successfully learned low-rank decompositions of matrix multiplication tensors
Requires a few thousand examples for the rank 7 decomposition of $M_2$ and $10^5$ for the rank 23 decomposition of $M_3$
High precision is essential to distinguish true decompositions
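To make the rank 7 decomposition of $M_2$ concrete, here is a minimal sketch that writes the classical Strassen rules as three 7×4 factor matrices and checks them numerically against ordinary matrix multiplication. The names `U`, `V`, `W`, and `strassen_2x2` are illustrative choices, not notation from the paper, and the coefficients are the textbook Strassen rules rather than weights learned by the network.

```python
import numpy as np

# Row-major vectorisation of a 2x2 matrix: [x11, x12, x21, x22].
# Row k of U and V gives the linear combinations of A and B entries
# that are fed to multiplier k (7 multipliers in total).
U = np.array([
    [ 1, 0, 0,  1],   # A11 + A22
    [ 0, 0, 1,  1],   # A21 + A22
    [ 1, 0, 0,  0],   # A11
    [ 0, 0, 0,  1],   # A22
    [ 1, 1, 0,  0],   # A11 + A12
    [-1, 0, 1,  0],   # A21 - A11
    [ 0, 1, 0, -1],   # A12 - A22
])
V = np.array([
    [ 1, 0, 0,  1],   # B11 + B22
    [ 1, 0, 0,  0],   # B11
    [ 0, 1, 0, -1],   # B12 - B22
    [-1, 0, 1,  0],   # B21 - B11
    [ 0, 0, 0,  1],   # B22
    [ 1, 1, 0,  0],   # B11 + B12
    [ 0, 0, 1,  1],   # B21 + B22
])
# Row k of W says how product k contributes to [C11, C12, C21, C22].
W = np.array([
    [ 1, 0, 0,  1],
    [ 0, 0, 1, -1],
    [ 0, 1, 0,  1],
    [ 1, 0, 1,  0],
    [-1, 1, 0,  0],
    [ 0, 0, 0,  1],
    [ 1, 0, 0,  0],
])

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using exactly 7 scalar multiplications."""
    products = (U @ A.reshape(4)) * (V @ B.reshape(4))  # the 7 multipliers
    return (W.T @ products).reshape(2, 2)

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
max_error = np.max(np.abs(strassen_2x2(A, B) - A @ B))
```

Each row index corresponds to one multiplier, so limiting a network to 7 multipliers leaves it no room for the naive 8-product rule.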
Abstract
We study neural networks whose only non-linear components are multipliers, to test a new training rule in a context where the precise representation of data is paramount. These networks are challenged to discover the rules of matrix multiplication, given many examples. By limiting the number of multipliers, the network is forced to discover the Strassen multiplication rules. This is the mathematical equivalent of finding low rank decompositions of the matrix multiplication tensor, $M_n$. We train these networks with the conservative learning rule, which makes minimal changes to the weights so as to give the correct output for each input at the time the input-output pair is received. Conservative learning needs a few thousand examples to find the rank 7 decomposition of $M_2$, and $10^5$ for the rank 23 decomposition of $M_3$ (the lowest known). High precision is critical,…
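The idea of a minimal weight change that makes the current example exactly correct can be illustrated on a much simpler model than the paper's multiplier networks. The sketch below assumes a single linear layer $y = Wx$, for which the smallest (Frobenius-norm) change to $W$ that fits one pair $(x, y)$ has a closed form. This is only an analogy under that assumption: the function name `conservative_step` is hypothetical, and the update for the bilinear networks studied in the paper is not reproduced here.

```python
import numpy as np

def conservative_step(W, x, y):
    """Smallest Frobenius-norm change to W such that the new W maps x to y.

    Illustrative only: valid for a purely linear layer y = W x,
    not for the multiplier networks described in the abstract.
    """
    residual = y - W @ x
    return W + np.outer(residual, x) / (x @ x)

rng = np.random.default_rng(0)
W_true = rng.standard_normal((3, 4))   # hypothetical target map
W = np.zeros((3, 4))                   # initial weights

# Examples arrive one at a time; each is fitted exactly when received.
for _ in range(400):
    x = rng.standard_normal(4)
    W = conservative_step(W, x, W_true @ x)

final_error = np.max(np.abs(W - W_true))
```

In this linear setting every step fits the newest example exactly while disturbing earlier ones as little as possible, and repeated steps on random inputs drive `W` toward `W_true`.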
Taxonomy
Topics: VLSI and FPGA Design Techniques · Cellular Automata and Applications · Low-power high-performance VLSI design
