Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices
Yudong Chen, Jiaming Xu

TL;DR
This paper studies the fundamental limits and computational challenges of recovering planted clusters in random graphs and hidden submatrices in random matrices, revealing a layered complexity landscape and a gap between statistical optimality and efficient algorithms.
Contribution
It characterizes the phase diagram of statistical and computational regimes for planted problems with many clusters, establishing tight recovery limits and demonstrating algorithmic failures in harder regimes.
Findings
Identifies four regimes of decreasing statistical and computational complexity: impossible, hard, easy, and simple.
Shows that the algorithms studied fail in the harder regimes, which suggests that polynomial-time algorithms may not achieve the minimax recovery limit.
Provides tight bounds on the recovery thresholds that hold for a growing number of clusters/submatrices.
Abstract
We consider two closely related problems: planted clustering and submatrix localization. The planted clustering problem assumes that a random graph is generated based on some underlying clusters of the nodes; the task is to recover these clusters given the graph. The submatrix localization problem concerns locating hidden submatrices with elevated means inside a large real-valued random matrix. Of particular interest is the setting where the number of clusters/submatrices is allowed to grow unbounded with the problem size. These formulations cover several classical models such as planted clique, planted densest subgraph, planted partition, planted coloring, and stochastic block model, which are widely used for studying community detection and clustering/bi-clustering. For both problems, we show that the space of the model parameters (cluster/submatrix size, cluster density, and…
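As an illustration of the planted clustering model described in the abstract, the following minimal sketch samples a random graph from hidden clusters. The function name and the parameters `n`, `r`, `K`, `p`, and `q` are assumptions introduced here for illustration and are not taken from the text above; the sketch covers only data generation, not any recovery algorithm.

```python
import random


def planted_clustering(n, r, K, p, q, seed=0):
    """Sample a graph on n nodes with r planted clusters of K nodes each.

    Two nodes in the same cluster are joined with probability p; every
    other pair is joined with probability q. Nodes that belong to no
    cluster get label -1. All names here are illustrative assumptions.
    """
    if r * K > n:
        raise ValueError("need r * K <= n")
    rng = random.Random(seed)

    # Hide the clusters by assigning them to a random permutation of nodes.
    nodes = list(range(n))
    rng.shuffle(nodes)
    labels = [-1] * n
    for c in range(r):
        for v in nodes[c * K:(c + 1) * K]:
            labels[v] = c

    # Draw each undirected edge independently given the hidden labels.
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] != -1 and labels[i] == labels[j]
            if rng.random() < (p if same else q):
                adj[i][j] = adj[j][i] = 1
    return adj, labels


# Hypothetical usage: two clusters of 10 nodes among 30 nodes.
adj, labels = planted_clustering(n=30, r=2, K=10, p=0.8, q=0.1)
```

Under these assumptions, setting `r=1`, `p=1`, and `q=0.5` gives a planted clique instance, while `r * K == n` with `p > q` gives a planted partition instance. The recovery task is to reconstruct `labels` from `adj` alone.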
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Facility Location and Emergency Management · Random Matrices and Applications
