Revisiting mean estimation over $\ell_p$ balls: Is the MLE optimal?
Liviu Aolaritei, Michael I. Jordan, Reese Pathak, Annie Ulichney

TL;DR
This paper analyzes the optimality of the MLE for mean estimation under $\ell_p$ constraints, revealing its minimax optimality in some regimes and suboptimality in others, with implications for high-dimensional statistics.
Contribution
It characterizes the regimes where the MLE is minimax optimal or suboptimal for $\ell_p$-constrained mean estimation, providing explicit lower bounds and new bounds for non-convex cases.
Findings
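The MLE is minimax rate-optimal for $p \in [0, 1 + \Theta(1/\log d)]$ and for $p \ge 2$.
The MLE is minimax rate-suboptimal for $p$ strictly between $1 + \Theta(1/\log d)$ and $2$.
When it is suboptimal, the MLE's risk can exceed the minimax rate by a factor polynomial in the sample size.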
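For orientation, the setting behind these findings can be written out as below. This is the textbook form of the Gaussian sequence model with a norm-ball constraint, not notation quoted from the paper; the symbols $Y$, $\theta^\star$, $\sigma$, $R$, and $n$ are assumptions chosen here for illustration.

```latex
% Observation model: unknown mean in an \ell_p ball of radius R, Gaussian noise of level \sigma
Y = \theta^\star + \sigma \xi, \qquad \xi \sim \mathcal{N}(0, I_d), \qquad \|\theta^\star\|_p \le R .

% Under Gaussian noise, the constrained MLE is a least-squares projection onto the ball
\hat{\theta}_{\mathrm{MLE}} \in \operatorname*{arg\,min}_{\|\theta\|_p \le R} \; \|Y - \theta\|_2^2 .

% With n i.i.d. samples Y_1, \dots, Y_n \sim \mathcal{N}(\theta^\star, \sigma^2 I_d),
% the sample mean is sufficient and follows the same model at a reduced noise level
\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i \sim \mathcal{N}\!\left(\theta^\star, \tfrac{\sigma^2}{n} I_d\right) .
```

For $p < 1$ the constraint set is a quasinorm ball and is non-convex, so the minimizer need not be unique, hence the "$\in$" rather than "$=$".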
Abstract
We revisit the problem of mean estimation in the Gaussian sequence model with $\ell_p$ constraints for $p \in [0, \infty]$. We demonstrate two phenomena for the behavior of the maximum likelihood estimator (MLE), which depend on the noise level, the radius of the (quasi)norm constraint, the dimension, and the norm index $p$. First, if $p$ lies between $0$ and $1 + \Theta(1/\log d)$, inclusive, or if it is greater than or equal to $2$, the MLE is minimax rate-optimal for all noise levels and all constraint radii. On the other hand, for the remaining norm indices -- namely, if $p$ lies between $1 + \Theta(1/\log d)$ and $2$ -- there is a more striking behavior: the MLE is minimax rate-suboptimal, despite its nonlinearity in the observations, for essentially all noise levels and constraint radii for which nonlinear estimates are necessary for minimax-optimal estimation.…
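To make the estimator in the abstract concrete: under Gaussian noise, the constrained MLE is the Euclidean projection of the observation onto the $\ell_p$ ball. The sketch below, assuming NumPy, covers only the three cases with simple exact projections ($p = 1$, $p = 2$, $p = \infty$). The function names are hypothetical and not taken from the paper, and the $p = 1$ routine is the standard sort-and-threshold projection rather than anything the authors describe. The intermediate range $1 < p < 2$ and the non-convex range $p < 1$ need a numerical solver and are deliberately left out.

```python
import numpy as np


def project_l2_ball(y, radius):
    """Euclidean projection onto {theta : ||theta||_2 <= radius}."""
    norm = np.linalg.norm(y)
    if norm <= radius:
        return y.copy()
    # Outside the ball: rescale radially onto the boundary.
    return y * (radius / norm)


def project_linf_ball(y, radius):
    """Euclidean projection onto {theta : ||theta||_inf <= radius}."""
    # The constraint is coordinate-wise, so clipping is exact.
    return np.clip(y, -radius, radius)


def project_l1_ball(y, radius):
    """Euclidean projection onto {theta : ||theta||_1 <= radius}, radius > 0.

    Sort-based method: find the soft-threshold level tau such that the
    thresholded vector has l1 norm exactly equal to radius.
    """
    abs_y = np.abs(y)
    if abs_y.sum() <= radius:
        return y.copy()
    u = np.sort(abs_y)[::-1]                # magnitudes, descending
    cssv = np.cumsum(u) - radius            # cumulative sums shifted by radius
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * k > cssv)[0][-1]   # last index kept active
    tau = cssv[rho] / (rho + 1)
    return np.sign(y) * np.maximum(abs_y - tau, 0.0)


def constrained_mle(y, radius, p):
    """Hypothetical dispatcher: constrained MLE for p in {1, 2, inf} only."""
    if p == 1:
        return project_l1_ball(y, radius)
    if p == 2:
        return project_l2_ball(y, radius)
    if p == np.inf:
        return project_linf_ball(y, radius)
    raise NotImplementedError("only p in {1, 2, inf} are sketched here")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, sigma, radius = 50, 0.5, 1.0
    theta_star = np.zeros(d)
    theta_star[0] = radius                  # a sparse mean on the l1 sphere
    y = theta_star + sigma * rng.standard_normal(d)
    theta_hat = constrained_mle(y, radius, p=1)
    print("squared error:", float(np.sum((theta_hat - theta_star) ** 2)))
```

The $p = 1$ projection is a soft-thresholding rule, which is the kind of nonlinearity in the observations that the abstract refers to. The $p = 2$ projection rescales $y$ by a scalar that itself depends on $\|y\|_2$.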
Taxonomy
Topics: Statistical Methods and Inference · Distributed Sensor Networks and Detection Algorithms · Sparse and Compressive Sensing Techniques
