Sparse Rank Regression for Restricted-Access Economic Data
Wen Zhang, Songshan Yang, Liping Zhu

TL;DR
This paper introduces a distributed sparse rank regression method tailored for restricted-access economic data with heavy-tailed outcomes, providing theoretical guarantees and practical improvements over naive approaches.
Contribution
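It develops distributed convoluted rank regression (DCRR), a surrogate criterion that combines a single local CRR loss with an aggregated gradient correction, for heavy-tailed outcomes in restricted-access data. The method comes with error bounds and model selection consistency, and it outperforms naive divide-and-conquer in simulations.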
Findings
DCRR closely approximates pooled CRR in simulations.
The method outperforms naive divide-and-conquer approaches.
Theoretical guarantees include error bounds and model selection consistency.
Abstract
Empirical research in economics increasingly relies on restricted-access data held by multiple firms or agencies, making it impossible to construct the estimator of interest on the pooled sample. At the same time, heavy-tailed distributions are pervasive in economics and finance outcomes such as prices, expenditures and loan sizes. We study sparse, robust estimation in the restricted-access setting. The infeasible pooled benchmark is convoluted rank regression (CRR), a smooth rank-based estimator designed for heavy-tailed outcomes. Because the CRR criterion is a non-additive U-statistic, existing communication-efficient methods built for additive empirical losses do not directly apply. We propose distributed convoluted rank regression (DCRR), a surrogate criterion built from a single local CRR loss and an aggregated gradient correction, and show that it shares the same population…
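The construction the abstract describes (one local CRR loss plus an aggregated gradient correction) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's definition: the Gaussian smoothing kernel, the bandwidth `h`, equal site sizes, the function names, and the surrogate form itself, which is borrowed from the standard communication-efficient surrogate-loss recipe (local loss minus a linear term that swaps the local gradient at an initial estimate for the site-averaged gradient). The sparsity penalty is omitted.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # elementwise erf without requiring SciPy


def smoothed_abs(u, h):
    """E|u - h*Z| with Z ~ N(0, 1): absolute loss convolved with a Gaussian kernel (assumed kernel)."""
    z = u / h
    Phi = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    phi = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return u * (2.0 * Phi - 1.0) + 2.0 * h * phi


def smoothed_abs_grad(u, h):
    """Derivative of smoothed_abs in u: 2*Phi(u/h) - 1 = erf(u / (h*sqrt(2)))."""
    return _erf(u / (h * math.sqrt(2.0)))


def crr_loss(beta, X, y, h):
    """Pairwise (U-statistic) smoothed rank loss on one site's data."""
    r = y - X @ beta
    L = smoothed_abs(r[:, None] - r[None, :], h)  # all residual differences
    n = len(y)
    return (L.sum() - np.trace(L)) / (n * (n - 1))  # drop the i == j terms


def crr_grad(beta, X, y, h):
    """Gradient of crr_loss in beta; uses antisymmetry of the pairwise derivative."""
    r = y - X @ beta
    G = smoothed_abs_grad(r[:, None] - r[None, :], h)
    n = len(y)
    return -2.0 * X.T @ G.sum(axis=1) / (n * (n - 1))


def dcrr_surrogate(beta, beta_init, local, all_sites, h):
    """Hypothetical surrogate: local loss minus a gradient-correction term.

    Each site only needs to share its gradient at beta_init, not its raw data.
    Assumes equal site sizes, so the aggregate is a plain average.
    """
    X1, y1 = local
    avg_grad = np.mean([crr_grad(beta_init, X, y, h) for X, y in all_sites], axis=0)
    correction = crr_grad(beta_init, X1, y1, h) - avg_grad
    return crr_loss(beta, X1, y1, h) - correction @ beta
```

By construction, the gradient of this surrogate at `beta_init` equals the site-averaged gradient, which is what lets a single site mimic the pooled criterion locally. Note that the average of per-site pairwise losses ignores cross-site pairs, so it is not the pooled U-statistic itself. How the paper handles that non-additivity is not shown here.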
