Generalization and Overfitting in Matrix Product State Machine Learning Architectures
Artem Strashko, E. Miles Stoudenmire

TL;DR
This paper investigates how matrix product state (MPS) architectures generalize and overfit as data complexity varies. It finds clear overfitting on one-dimensional synthetic data, less significant overfitting on more complex data, and no signatures of overfitting on MNIST.
Contribution
The study provides empirical insight into the generalization behavior of MPS models, showing that overfitting depends on the data and qualifying earlier reports that test performance improves monotonically as the number of parameters grows.
Findings
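MPS models overfit on one-dimensional synthetic data constructed to be exactly representable by an MPS.
Overfitting is less significant for more complex data, and no signatures of overfitting are found with MNIST image data.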
Generalization properties depend on data complexity and compatibility with MPS.
Abstract
While overfitting and, more generally, double descent are ubiquitous in machine learning, increasing the number of parameters of the most widely used tensor network, the matrix product state (MPS), has generally led to monotonic improvement of test performance in previous studies. To better understand the generalization properties of architectures parameterized by MPS, we construct artificial data which can be exactly modeled by an MPS and train the models with different numbers of parameters. We observe model overfitting for one-dimensional data, but also find that for more complex data overfitting is less significant, while with MNIST image data we do not find any signatures of overfitting. We speculate that generalization properties of MPS depend on the properties of data: with one-dimensional data (for which the MPS ansatz is the most suitable) MPS is prone to overfitting, while…
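To make the setup in the abstract concrete, the following is a minimal sketch of how data "exactly modeled by an MPS" might be generated, and how the parameter count grows with bond dimension. It is not the authors' code. It assumes an open-boundary MPS whose squared amplitudes define a probability distribution over short binary strings (one common convention, which may differ from the paper's setup), and it normalizes by brute-force enumeration, which is only feasible for a handful of sites. All function names and sizes (`random_mps`, `born_distribution`, 8 sites, bond dimension 4) are hypothetical.

```python
import itertools
import numpy as np

def random_mps(n_sites, phys_dim, bond_dim, rng):
    """Random open-boundary MPS: a list of tensors shaped (left, phys, right)."""
    dims = [1] + [bond_dim] * (n_sites - 1) + [1]
    return [rng.normal(size=(dims[i], phys_dim, dims[i + 1]))
            for i in range(n_sites)]

def amplitude(mps, config):
    """Contract the MPS left to right for one configuration of physical indices."""
    vec = np.ones(1)
    for tensor, s in zip(mps, config):
        vec = vec @ tensor[:, s, :]
    return vec[0]

def born_distribution(mps, phys_dim):
    """Enumerate all configurations and normalize the squared amplitudes."""
    configs = list(itertools.product(range(phys_dim), repeat=len(mps)))
    weights = np.array([amplitude(mps, c) ** 2 for c in configs])
    return configs, weights / weights.sum()

def n_params(mps):
    """Nominal parameter count: total number of tensor entries."""
    return sum(t.size for t in mps)

rng = np.random.default_rng(0)

# Assumed "ground truth" MPS that generates the synthetic data.
target = random_mps(n_sites=8, phys_dim=2, bond_dim=4, rng=rng)
configs, probs = born_distribution(target, phys_dim=2)

# Draw a small training set from the target distribution.
sample_idx = rng.choice(len(configs), size=200, p=probs)
train_set = [configs[i] for i in sample_idx]

# Candidate models of increasing capacity: parameters grow roughly
# quadratically with bond dimension.
for chi in (2, 4, 8):
    model = random_mps(n_sites=8, phys_dim=2, bond_dim=chi, rng=rng)
    print(f"bond dimension {chi}: {n_params(model)} parameters")
```

In an experiment of this kind, models with bond dimension below, equal to, and above that of the target would each be fit to `train_set`, and the gap between training and held-out loss would then indicate overfitting. The training loop itself is omitted here because the excerpt does not specify the optimizer or loss.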
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Machine Learning in Materials Science · Graph Theory and Algorithms
Methods: Test
