Better Learning-Augmented Spanning Tree Algorithms via Metric Forest Completion
Nate Veldt, Thomas Stanley, Benjamin W. Priest, Trevor Steil, Keita Iwabuchi, T.S. Jayram, Grace J. Li, Geoffrey Sanders

TL;DR
This paper introduces a generalized learning-augmented algorithm for approximate minimum spanning trees in metric spaces. By considering only edges incident to strategically chosen representative points, it improves the approximation factor for metric forest completion from 2.62 to 2 and for metric MST from (2γ+1) to 2γ.
Contribution
The paper develops a generalized method that interpolates between the prior subquadratic MFC algorithm and an optimal MFC algorithm, lowering the approximation factors for MFC and metric MST and giving better instance-specific results.
Findings
Improved approximation factor from 2.62 to 2 for MFC.
Enhanced approximation from (2γ+1) to 2γ for metric MST.
Experimental results validate theoretical improvements.
Abstract
We present improved learning-augmented algorithms for finding an approximate minimum spanning tree (MST) for points in an arbitrary metric space. Our work follows a recent framework called metric forest completion (MFC), where the learned input is a forest that must be given additional edges to form a full spanning tree. Veldt et al. (2025) showed that optimally completing the forest takes $\Omega(n^2)$ time, but designed a 2.62-approximation for MFC with subquadratic complexity. The same method is a $(2\gamma+1)$-approximation for the original MST problem, where $\gamma$ is a quality parameter for the initial forest. We introduce a generalized method that interpolates between this prior algorithm and an optimal $\Omega(n^2)$-time MFC algorithm. Our approach considers only edges incident to a growing number of strategically chosen "representative" points. One corollary of our…
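As a rough illustration of the completion step the abstract describes, the Python sketch below joins the components of a given forest using only edges that touch a representative point. It is a simplified sketch under stated assumptions, not the authors' implementation: the function name `complete_forest`, the `labels` and `reps` inputs, and the use of Euclidean distance are all hypothetical choices made for this example.

```python
import math
from itertools import combinations

def complete_forest(points, labels, reps, dist=math.dist):
    """Connect the components of a forest using representative-incident edges.

    points: list of coordinate tuples (Euclidean distance is an assumption).
    labels: labels[i] is the component of point i in the initial forest.
    reps:   dict mapping each component label to its representative indices.
    Returns a list of (i, j, weight) edges joining all components.
    """
    comps = {}
    for idx, lab in enumerate(labels):
        comps.setdefault(lab, []).append(idx)

    # For each pair of components, keep the cheapest edge that has a
    # representative of one component as an endpoint.
    candidates = []
    for a, b in combinations(sorted(comps), 2):
        best = None
        for r in reps[a]:
            for j in comps[b]:
                w = dist(points[r], points[j])
                if best is None or w < best[0]:
                    best = (w, r, j)
        for r in reps[b]:
            for i in comps[a]:
                w = dist(points[i], points[r])
                if best is None or w < best[0]:
                    best = (w, i, r)
        candidates.append((best[0], best[1], best[2], a, b))

    # Kruskal's algorithm on the graph whose nodes are the components.
    parent = {c: c for c in comps}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    chosen = []
    for w, i, j, a, b in sorted(candidates, key=lambda c: c[0]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            chosen.append((i, j, w))
    return chosen

points = [(0, 0), (1, 0), (10, 0), (11, 0), (0, 10)]
labels = [0, 0, 1, 1, 2]
one_rep = {0: [0], 1: [2], 2: [4]}
edges = complete_forest(points, labels, one_rep)
```

In this sketch, listing every point of a component as a representative makes the search examine all inter-component pairs, while a single representative per component keeps the number of distance evaluations small. Moving between those two extremes is the kind of trade-off the abstract refers to.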
Peer Reviews
Decision: ICLR 2026 Poster
The previous work solved the MFC problem by considering one representative point per partition (in the forest), and then finding the set of edges minimizing the cost. The extension in this work is to instead consider multiple points in each partition. The problem of finding the best such set of points across different partitions is formulated as a variant of the $k$-center problem. The authors then show an extension of the classical 2-approximation $k$-center algorithm for this problem. This in
The idea of using multiple representatives per partition is a natural one. The analysis is clean but the algorithmic contributions are incremental. The authors argue for tightness of their result, but in the hard instance, the number of representative points per partition is 1.
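For reference, the classical greedy 2-approximation for $k$-center that the review mentions can be sketched in Python as follows. This shows only the standard single-instance farthest-point heuristic, not the paper's multi-partition extension; the function name `farthest_point_centers`, the choice of starting point, and the Euclidean metric are assumptions made for this example.

```python
import math

def farthest_point_centers(points, k, dist=math.dist, first=0):
    """Greedy farthest-point selection for k-center.

    Starts from points[first] and repeatedly adds the point farthest from
    the centers chosen so far. Returns (center indices, covering radius).
    """
    centers = [first]
    # d[i] is the distance from point i to its nearest chosen center.
    d = [dist(p, points[first]) for p in points]
    while len(centers) < min(k, len(points)):
        nxt = max(range(len(points)), key=d.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            d[i] = min(d[i], dist(p, points[nxt]))
    return centers, max(d)

centers, radius = farthest_point_centers([(0, 0), (1, 0), (10, 0), (11, 0)], k=2)
```

Applied within one partition, the returned indices could serve as that partition's representatives. How a budget is shared across partitions is the part the paper addresses and is not modeled here.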
1. Firstly, I appreciate that the paper is well-written and well-organized; I enjoy reading it. The authors provide sufficient intuition for each main theorem, making it easy to understand the main idea behind the improved algorithm.
2. This paper considers an interesting problem and uses a simple idea to get an improved algorithm. The algorithm is easy to implement, and thus, I expect it to have a positive impact in practice.
3. Casting representative selection as a shared-budget multi-instan
1. From the theoretical view, this work looks incremental. The idea of increasing the size of the representative set is standard and appears in many other classical problems (e.g., k-means++ and greedy k-means++). This is a weakness from a purely technical angle, but borrowing techniques from one problem to create a better approximation algorithm sounds like a good approach to me, especially for a non-theory conference.
2. Computing this larger representative set looks time-expensive to me, but
* The proof of Theorem 1 is very clear and simple, and at the same time improves the guarantees by Veldt et al. In my opinion, this is a very nice contribution.
* The empirical experiments nicely illustrate how the choices (budget, selection of representatives) in the algorithmic framework impact performance of the algorithm.
* The paper studies a well-motivated problem. The framework of learning-augmented algorithms is still mainly studied for online problems: According to the online repository
* The impact of allowing more representatives seems to be mainly studied empirically. It would be nice to have a discussion of theoretical results depending on the budget.
* The new analysis of the algorithm does not require many new technical ideas.
* For the sake of transparency, it would be nice to already mention in the introduction that the running time depends on the number of components.
Taxonomy
Topics: Complexity and Algorithms in Graphs · Computational Geometry and Mesh Generation · Facility Location and Emergency Management
