Subspace Exploration: Bounds on Projected Frequency Estimation
Graham Cormode, Charlie Dickens, David P. Woodruff

TL;DR
This paper investigates the space complexity of computing data analysis functions over subspaces of high-dimensional data, establishing lower bounds and proposing upper bounds that improve upon naive approaches.
Contribution
It provides new lower bounds and space-approximation tradeoffs for subspace frequency estimation problems, using coding theory and combinatorial reductions.
Findings
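Many subspace analysis problems, including heavy hitters and norms, require space exponential in the dimension when the subspace is revealed only after the data has been observed.
The upper bounds exhibit a space-approximation tradeoff: accepting a coarser approximation factor reduces the space required.
This improves on the naive approach of keeping information for every subset of columns.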
Abstract
Given an $n \times d$ dimensional dataset $A$, a projection query specifies a subset $C \subseteq [d]$ of columns which yields a new $n \times |C|$ array. We study the space complexity of computing data analysis functions over such subspaces, including heavy hitters and norms, when the subspaces are revealed only after observing the data. We show that this important class of problems is typically hard: for many problems, we show $2^{\Omega(d)}$ lower bounds. However, we present upper bounds which demonstrate space dependency better than $2^d$. That is, for $c, c' \in (0,1)$ and a parameter $N = 2^d$, an $N^c$-approximation can be obtained in space $\min(N^{c'}, n)$, showing that it is possible to improve on the naïve approach of keeping information for all $2^d$ subsets of $d$ columns. Our results are based on careful constructions of instances using coding theory and novel combinatorial…
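To make the notion of a projection query concrete, the following is a minimal sketch of the exact, store-everything baseline: it keeps the whole dataset and answers a frequency or heavy-hitter query only once the column subset is known. It is not the paper's algorithm. The function names, the toy dataset, and the heavy-hitter rule (a pattern is reported when it appears in at least a `phi` fraction of rows) are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Tuple[Hashable, ...]


def project(data: Sequence[Row], columns: Sequence[int]) -> List[Row]:
    """Keep only the queried columns of every row (hypothetical helper)."""
    return [tuple(row[j] for j in columns) for row in data]


def projected_frequencies(data: Sequence[Row], columns: Sequence[int]) -> Counter:
    """Count how often each projected row pattern occurs."""
    return Counter(project(data, columns))


def projected_heavy_hitters(
    data: Sequence[Row], columns: Sequence[int], phi: float
) -> Dict[Row, int]:
    """Return patterns whose count is at least phi * (number of rows).

    The threshold rule is an assumption chosen for this sketch.
    """
    freqs = projected_frequencies(data, columns)
    threshold = phi * len(data)
    return {pattern: count for pattern, count in freqs.items() if count >= threshold}


# Toy dataset with 4 rows and 3 binary columns (illustrative only).
A = [
    (0, 1, 0),
    (0, 1, 1),
    (1, 1, 0),
    (0, 0, 0),
]

# The column subset is supplied only at query time, after the data is stored.
query_columns = [0, 1]
print(projected_frequencies(A, query_columns))
print(projected_heavy_hitters(A, query_columns, phi=0.5))
```

This baseline answers any query exactly but uses space proportional to the full dataset. The paper's question is how much of that space can be saved when only a summary is kept and the column subset arrives afterwards.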
Taxonomy
Topics: Target Tracking and Data Fusion in Sensor Networks · GNSS positioning and interference · Speech and Audio Processing
