Beyond Worst-Case Dimensionality Reduction for Sparse Vectors

Sandeep Silwal; David P. Woodruff; Qiuyi Zhang

arXiv:2502.19865·cs.DS·February 28, 2025

Open Access 3 Reviews

TL;DR

This paper investigates the limits of dimensionality reduction for sparse vectors, providing lower bounds for average-case guarantees, and introduces efficient embeddings for non-negative sparse vectors that preserve distances with fewer dimensions.

Contribution

It establishes optimal lower bounds for oblivious linear maps in average-case scenarios and proposes new non-linear embeddings for non-negative sparse vectors with improved dimension bounds.

Findings

01

Lower bounds match upper bounds for oblivious linear maps.

02

Non-negative sparse vectors can be embedded into fewer dimensions while preserving distances.

03

Exact embedding for the $\ell_\infty$ norm is achievable with tight bounds.
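
The birthday-paradox upper bound behind Finding 01 can be seen from a union bound. As a sketch, assume each coordinate is hashed independently and uniformly into one of $m$ buckets by a map $h$ (the symbols $m$ and $h$ and the constant 50 below are illustrative assumptions, not notation taken from the paper). For a fixed $s$-sparse vector $x$:

$$
\Pr\big[\exists\, i \neq j \in \mathrm{supp}(x) : h(i) = h(j)\big] \;\le\; \binom{s}{2}\cdot\frac{1}{m} \;\le\; \frac{s^{2}}{2m}.
$$

Taking $m = 50 s^{2}$ makes this probability at most $1/100$, so in expectation at least 99% of any fixed collection avoids collisions, and some fixed choice of $h$ does at least that well. When no two support coordinates collide, the nonzero entries are only relocated, so every $\ell_p$ norm is unchanged.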

Abstract

We study beyond worst-case dimensionality reduction for $s$-sparse vectors. Our work is divided into two parts, each focusing on a different facet of beyond worst-case analysis: We first consider average-case guarantees. A folklore upper bound based on the birthday paradox states: For any collection $X$ of $s$-sparse vectors in $\mathbb{R}^{d}$, there exists a linear map to $\mathbb{R}^{O(s^{2})}$ which exactly preserves the norm of $99\%$ of the vectors in $X$ in any $\ell_{p}$ norm (as opposed to the usual setting where guarantees hold for all vectors). We give lower bounds showing that this is indeed optimal in many settings: any oblivious linear map satisfying similar average-case guarantees must map to $\Omega(s^{2})$ dimensions. The same lower bound also holds for a wide class of smooth maps, including 'encoder-decoder schemes', where we compare the norm of the original vector to…
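
The folklore birthday-paradox map mentioned in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' construction: the helper names (`make_buckets`, `embed`, `collision_free`, `random_sparse`), the constant 50 in `m = 50 * s * s`, and the random test vectors are all hypothetical choices made for illustration. Each coordinate is hashed to one of `m` buckets and colliding coordinates are summed, which is a linear map; a vector whose support has no collisions keeps all its nonzero entries and hence its norm.

```python
import math
import random

def make_buckets(d, m, seed=0):
    """Assign each of the d coordinates to one of m buckets uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(m) for _ in range(d)]

def embed(x, buckets, m):
    """Linear map: y[b] is the sum of x[i] over coordinates i sent to bucket b.
    The sparse vector x is given as a dict {index: value}."""
    y = [0.0] * m
    for i, v in x.items():
        y[buckets[i]] += v
    return y

def lp_norm(values, p):
    """The l_p norm for finite p >= 1."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def collision_free(x, buckets):
    """True if no two support coordinates of x share a bucket."""
    return len({buckets[i] for i in x}) == len(x)

def random_sparse(d, s, rng):
    """A random s-sparse vector (illustrative test data only)."""
    support = rng.sample(range(d), s)
    return {i: rng.uniform(-1.0, 1.0) for i in support}

d, s = 10_000, 10
m = 50 * s * s  # O(s^2) target dimension; the constant 50 is an assumption
buckets = make_buckets(d, m)
rng = random.Random(1)
vectors = [random_sparse(d, s, rng) for _ in range(1000)]

good = [x for x in vectors if collision_free(x, buckets)]
frac = len(good) / len(vectors)
print(f"collision-free fraction: {frac:.3f}")

# For collision-free vectors the entries are only relocated, so norms agree
# (up to floating-point summation order for finite p).
for x in good:
    y = embed(x, buckets, m)
    assert math.isclose(lp_norm(y, 2), lp_norm(x.values(), 2), rel_tol=1e-12)
```

The target dimension here depends only on the sparsity `s`, not on the ambient dimension `d`, which is the point of the average-case guarantee. The lower bounds in the abstract say that, for oblivious linear maps, the quadratic dependence on `s` cannot be improved.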

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

- The paper is very well written and the authors do a thorough study of dimension reduction for sparse and non-negative sparse vectors.
- While the proof is not hard, the fact that the lower bound matches the birthday embedding is interesting. It is also interesting that the lower bounds extend to non-linear but smooth embeddings.
- The upper bound also felt interesting, but perhaps the authors can clarify things a bit more (see questions below).

Weaknesses

- The results are nice, but it does feel like a pure theory paper, raising the question of whether ICLR is the right venue. That said, dimension reduction is ubiquitous in ML, and so I feel it is justified.
- The settings considered in the lower bounds and upper bounds are quite different: the lower bounds work only for linear and smooth embeddings, while the upper bound uses a non-linear embedding. So stating the results as a "separation" between general and non-negative results does not seem conv

Reviewer 02 · Rating 6 · Confidence 3

Strengths

The presentation of this paper is generally clear. Readers from different backgrounds should be able to follow its main idea. For the second result, the map constructed by the authors seems to be novel and interesting. Researchers who work in this area may be able to learn from the insight behind this construction.

Weaknesses

Regarding the first result, it seems natural to expect a lower bound of $s^2$ because of the birthday paradox. Indeed, the main analysis of the lower bound is mostly a probability calculation, and the construction is based on uniform sampling, which is not particularly surprising. I am not quite able to see new techniques introduced in this part.

Reviewer 03 · Rating 6 · Confidence 3

Strengths

Apart from the interesting hard cases, their results on nonlinear mappings for non-negative vectors, which provide guarantees for preserving $\ell_\infty$ distances, highlight the importance of moving beyond linear i.i.d. mappings for certain problems. Their overall theoretical results are good, and I recommend acceptance at this venue.

Weaknesses

I don't recommend a strong accept, mainly because of the lack of a cohesive insight across the two main research questions they address: average-case lower bounds for general sparse vectors and improved upper bounds for non-negative sparse vectors. The paper's array of results could also be supplemented with more insight into the hard distributions (of the type $\mathrm{Unif}_{t,r}$) that they constructed. They also mention that Birthday Paradox like maps are folklore, but it would provide more completene

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs