MatRL: Provably Generalizable Iterative Algorithm Discovery via Monte-Carlo Tree Search
Sungyoon Kim, Rajat Vadiraj Dwaraknath, Longling Geng, Mert Pilanci

TL;DR
MatRL is a reinforcement learning framework that automatically discovers and plans iterative algorithms for matrix functions, ensuring provable generalization and improved performance over existing methods.
Contribution
The paper introduces MatRL, a novel RL-based approach that automates the design of matrix iteration algorithms with theoretical guarantees of generalization.
Findings
MatRL outperforms baseline algorithms in numerical experiments.
Learned algorithms generalize to larger matrices from the same distribution.
The approach effectively combines Monte-Carlo tree search with reinforcement learning.
Abstract
Iterative methods for computing matrix functions have been extensively studied, and their convergence speed can be significantly improved with the right tuning of parameters and by mixing different iteration types. Hand-tuning the design options for optimal performance can be cumbersome, especially in modern computing environments: numerous different classical iterations and their variants exist, each with non-trivial per-step cost and tuning parameters. To this end, we propose MatRL -- a reinforcement-learning-based framework that automatically discovers iterative algorithms for computing matrix functions. The key idea is to treat algorithm design as a sequential decision-making process. Monte-Carlo tree search is then used to plan a hybrid sequence of matrix iterations and step sizes, tailored to a specific input matrix distribution and computing environment. Moreover, we also show that…
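To make the idea of a "hybrid sequence of matrix iterations and step sizes" concrete, here is a minimal sketch, not the authors' implementation. It assumes the matrix sign function as the target, two classical iterations (Newton and Newton-Schulz) as the available actions, and a relaxed update `X + alpha * (G(X) - X)` as a hypothetical way to attach a step size to each action. The names `run_plan`, `ITERATIONS`, and `make_test_matrix` are illustrative. The plan is fixed by hand; the search that MatRL performs with Monte-Carlo tree search is not shown.

```python
import numpy as np

def newton(X):
    # Newton iteration for sign(X): requires one matrix inverse per step.
    return 0.5 * (X + np.linalg.inv(X))

def newton_schulz(X):
    # Inverse-free Newton-Schulz iteration: two matrix products per step.
    # It only converges to sign(X) when every eigenvalue of X has
    # magnitude in (0, sqrt(3)).
    return 0.5 * X @ (3.0 * np.eye(X.shape[0]) - X @ X)

ITERATIONS = {"newton": newton, "newton_schulz": newton_schulz}

def run_plan(A, plan):
    """Apply a fixed sequence of (iteration name, step size) actions to A."""
    X = A.copy()
    for name, alpha in plan:
        # Relaxed update (an assumed parameterization): alpha = 1 is the plain step.
        X = X + alpha * (ITERATIONS[name](X) - X)
    return X

def make_test_matrix(n=6, seed=0):
    """Symmetric matrix with known eigenvalues, plus its exact sign function."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    signs = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)
    eigs = np.linspace(0.1, 1.0, n) * signs
    A = Q @ np.diag(eigs) @ Q.T
    sign_A = Q @ np.diag(np.sign(eigs)) @ Q.T
    return A, sign_A

A, sign_A = make_test_matrix()

# Hand-picked hybrid plan: Newton steps first to pull the small eigenvalues
# towards magnitude 1, then cheaper inverse-free steps to finish.
plan = [("newton", 1.0)] * 4 + [("newton_schulz", 1.0)] * 4
X = run_plan(A, plan)
error = np.linalg.norm(X - sign_A)
print(f"hybrid plan error: {error:.2e}")
```

The two iterations differ in per-step cost (an inverse versus two matrix products) and in where they are safe to apply, which is why the order of actions matters. Choosing that order and the step sizes automatically for a given matrix distribution and hardware is the planning problem the abstract describes.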
Peer Reviews
Decision: Submitted to ICLR 2026
The paper is interesting. I'm pretty borderline on it because I'm not super clear how much "new science" there is in it; though if the paper is considered to be sufficiently novel then I don't really have any other concerns about its publication. I'm a bit of an outsider to RL as a community, so it's hard for me to judge how much this paper is really cool RL research (as opposed to like good engineering on top of standard RL tools). The motivation is compelling. We want to compute spectral fun
The paper is confusing in places. The experiments are still weaker than I'd want them to be, in terms of validating a proposed mathematical algorithm on a wide variety of matrices. I don't really understand all the twists and turns in the precise engineering of the titular MatRL algorithm [page 6]. The pseudocode is a bit too abstract, and the importance of all the elements feels a bit vague to me. I wouldn't be surprised if this feels vague to me because I'm not as familiar with the RL side of th
- The RL algorithm managed to identify algorithms with better performance on the examples presented in the paper.
The training is applicable only when the state is reduced to the spectrum of the matrix and to functions within the Congruence Invariant Diagonal Preserving framework. This likely limits the general applicability of the resulting algorithms. The results presented in Section 4 on generalization appear rather weak, as they essentially concern only matrices with asymptotically similar spectra. The title therefore seems somewhat optimistic. A deeper investigation of the algorithms’ generalization p
* The paper considers the automated discovery of matrix algorithms, which has the potential to improve computational workloads by finding more efficient algorithms
* Numerical experiments suggest that computational improvements can be achieved compared to existing baselines / state-of-the-art algorithms
* The paper is, overall, hard to follow. It is easy to get lost in the various notations, and some terminologies are used without a clear definition
* The proof of several key theoretical results is deferred to the appendix, which makes it hard to validate the findings
* The experiments appear to be executed on a single matrix (see questions below), and the paper does not appear to discuss performance variability
* Proposition 1 appears to be an existential result based on what seem to be relativel
Taxonomy
Topics: Fuzzy Logic and Control Systems · Neural Networks and Applications · Time Series Analysis and Forecasting
