High-Dimensional Hettmansperger-Randles Estimator and its Applications

Guowei Yan; Long Feng; Xiaoxu Zhang

arXiv:2505.01669·stat.ME·May 6, 2025

Open Access · 4 Reviews

TL;DR

This paper introduces a high-dimensional version of the Hettmansperger-Randles Estimator for robust inference on location and scatter in high-dimensional data, demonstrating its effectiveness through simulations and real data.

Contribution

It proposes a novel high-dimensional Hettmansperger-Randles Estimator and applies it to testing and classification problems, extending its applicability to high-dimensional settings.

Findings

01 · High effectiveness across various distributions

02 · Superior performance in simulations

03 · Improved results in real-data applications

Abstract

The classic Hettmansperger-Randles Estimator has found extensive use in robust statistical inference. However, it cannot be directly applied to high-dimensional data. In this paper, we propose a high-dimensional Hettmansperger-Randles Estimator for the location parameter and scatter matrix of elliptical distributions in high-dimensional scenarios. Subsequently, we apply these estimators to two prominent problems: the one-sample location test problem and quadratic discriminant analysis. We discover that the corresponding new methods exhibit high effectiveness across a broad range of distributions. Both simulation studies and real-data applications further illustrate the superiority of the newly proposed methods.
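For orientation, the classical (low-dimensional, $n > p$) HR estimator is usually described as a fixed-point iteration that updates a spatial-median-type location and a Tyler-type shape matrix together. The sketch below illustrates that classical iteration only; it is not the high-dimensional algorithm proposed in the paper. The function name `hr_estimator`, the coordinate-wise median starting value, the trace normalisation, and the stopping rule are all assumptions made for this example.

```python
import numpy as np

def hr_estimator(X, n_iter=100, tol=1e-8):
    """Illustrative classical HR iteration (requires n > p).

    Returns a location vector and a shape matrix normalised to trace p.
    All tuning choices here are assumptions, not taken from the paper.
    """
    n, p = X.shape
    mu = np.median(X, axis=0)   # starting value (assumption)
    sigma = np.eye(p)           # starting shape matrix (assumption)
    for _ in range(n_iter):
        # Symmetric square root and inverse square root of the current shape.
        vals, vecs = np.linalg.eigh(sigma)
        root = (vecs * np.sqrt(vals)) @ vecs.T
        inv_root = (vecs / np.sqrt(vals)) @ vecs.T

        # Standardised residuals, their norms, and their unit directions.
        eps = (X - mu) @ inv_root
        r = np.maximum(np.linalg.norm(eps, axis=1), 1e-12)
        U = eps / r[:, None]

        # Location step: mean direction, rescaled and mapped back by Sigma^{1/2}.
        step = root @ U.mean(axis=0) / np.mean(1.0 / r)
        mu = mu + step

        # Shape step: Tyler-type update, then renormalise to trace p.
        new_sigma = p * root @ (U.T @ U / n) @ root
        new_sigma = p * new_sigma / np.trace(new_sigma)

        change = np.linalg.norm(step) + np.linalg.norm(new_sigma - sigma)
        sigma = new_sigma
        if change < tol:
            break
    return mu, sigma
```

Because only the directions of the standardised residuals enter the updates, heavy tails in the radial part have limited influence, which is the robustness property the abstract refers to. The iteration needs $\Sigma^{-1/2}$ at every step, which is why it cannot be applied directly once $p$ exceeds $n$.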

Peer Reviews

Decision·ICLR 2026 Conference Desk Rejected Submission

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. The generalization of the HR estimator to high-dimensional data appears to be novel, and the proposed banded HR update (Algorithm 2) is sound and computationally implementable. 2. The authors provide a rigorous theoretical development: they prove the Bahadur representation for the HR location estimator, the Gaussian approximation over convex sets, and the asymptotic independence between $T_{sum}$ and $T_{max}$, which justifies the Cauchy combination test.
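The Cauchy combination test mentioned by the reviewer merges several p-values by mapping each to a standard Cauchy variate, averaging, and mapping back. The sketch below shows the generic rule with equal weights by default; the function name `cauchy_combination` is hypothetical, and how the paper obtains the individual p-values for $T_{sum}$ and $T_{max}$ is not shown here.

```python
import math

def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule.

    weights must be non-negative and sum to one; equal weights are used
    when none are given (an assumption made for this sketch).
    """
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k
    # Each p-value becomes a standard Cauchy variate under the null.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    # Upper-tail probability of the standard Cauchy distribution at t.
    return 0.5 - math.atan(t) / math.pi
```

A call such as `cauchy_combination([p_sum, p_max])` returns a single p-value that is small whenever either input is small, which is how the combined test aims to retain power against both dense and sparse alternatives.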

Weaknesses

1. The method lacks theoretical guidance on bandwidth selection. 2. All results rely on the elliptical model, and the simulation study lacks a sensitivity analysis beyond elliptical data. 3. Competing robust methods (e.g., robust covariance shrinkage) are not included in the experiments. Including at least one such baseline would make the empirical evaluation more comprehensive.

Reviewer 02 · Rating 2 · Confidence 4

Strengths

* While I could not check the entire proof, I tried to check Lemma 1 (which I believe is the most technical ingredient), and it seems correct (provided my questions are resolved; see the question section). I think the theoretical analyses conducted in this paper are nontrivial.
* Provided the concern about Assumption 1 is resolved (see the weakness section), the ARE calculation shows that the method behaves better than standard methods under heavy-tailed distributions, as claimed. In this regard, the propo

Weaknesses

* The underlying assumptions are highly restrictive, and justifications are missing. I believe the assumptions are unlikely to hold for any practical problem.
* Precisely, I conjecture that Assumption 1, specifically the sub-Gaussian condition, fails in almost every case. Here is the heuristic reasoning. Let’s assume $\zeta_1^{-1}$ exists by a constant (which seems like a weak condition in the high-dimensional setting). Then, $P(\zeta^{-1}r^{-1} > t) = P(r < t^{-1}\zeta^{-1})$. Since we are looking at

Reviewer 03 · Rating 6 · Confidence 3

Strengths

The paper extends the Hettmansperger-Randles (HR) estimator to high-dimensional settings so that location and scatter can be estimated jointly in a robust manner while preserving affine equivariance (invariance under linear transformations). This provides a practical and principled alternative in cases where Tyler's estimator tends to be ill-defined when $p>n$. The authors establish a Gaussian approximation for standardized statistics and show that sum-type ($L_2$) and max-type ($L_{\infty}$) s

Weaknesses

**Dependence on elliptical distributions (explicitly acknowledged by the authors)**: The main results rely substantially on the elliptical family, specifically the approximation $p^{-1}\hat S \approx I_p$ that motivates Step 3 banding in the high-dimensional HR algorithm (Sec. 2; Algorithm 2; definition of $B_h$). As explicitly noted by the authors (Conclusion), relaxing this assumption is left for future work. The manuscript does not yet articulate minimal conditions under which the Gaussian a
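For readers unfamiliar with banding, the sketch below shows the usual banding operator from the covariance-estimation literature: entries more than $h$ positions from the diagonal are set to zero. It assumes that the paper's $B_h$ takes this standard form, which the review excerpt does not confirm; the function name `band` is hypothetical.

```python
import numpy as np

def band(M, h):
    """Zero every entry M[i, j] with |i - j| > h (standard banding, assumed form)."""
    M = np.asarray(M, dtype=float)
    idx = np.arange(M.shape[0])
    mask = np.abs(idx[:, None] - idx[None, :]) <= h
    return np.where(mask, M, 0.0)
```

With `h = 0` only the diagonal survives, and larger `h` keeps more off-diagonal structure, so the bandwidth controls the trade-off between bias and variance that Reviewer 01 notes is left without theoretical guidance.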

Reviewer 04 · Rating 4 · Confidence 3

Strengths

* **Solid Theoretical Foundation:** The work is built upon the well-established HR estimator framework, which is affine equivariant and offers robustness (Elliptical Distribution).
* **Relevant High-Dimensional Applications:** Applying this robust estimator to fundamental high-dimensional problems like mean testing and QDA is highly relevant and demonstrates practical value.
* **Comprehensive Evaluation:** The paper includes thorough theoretical analysis (asymptotic theory) and empirical v

Weaknesses

* **Algorithmic Description Lacks Rigor:** The description of the algorithms is not precise enough. Key details are missing, making it difficult to understand or reproduce the methods. For instance:
  * It is unclear how $\widehat{\Sigma}^{-1/2}$ is computed. If a method like the Sparse Graphical Lasso (SGLASSO) is used, the process for hyperparameter tuning is not mentioned.
* **Unverified Assumptions:** The methodology seems to rely on the assumption that the elliptical distribution is

Taxonomy

Topics · Experimental and Theoretical Physics Studies