BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory
Amirali Aghazadeh, Vipul Gupta, Alex DeWeese, O. Ozan Koyluoglu, and Kannan Ramchandran

TL;DR
BEAR is a second-order feature selection algorithm that uses BFGS with Count Sketch to efficiently handle ultra-high dimensional data in limited memory environments, outperforming first-order methods.
Contribution
It introduces a novel second-order sketching algorithm, BEAR, that reduces memory usage and improves accuracy in ultra-high dimensional feature selection tasks.
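To make "second-order" concrete: BFGS-type methods rescale the gradient by an estimate of the inverse Hessian built from recent parameter and gradient differences. The block below is a minimal dense NumPy sketch of the standard limited-memory two-loop recursion, given only as an illustration. It is not BEAR's implementation; the function name `lbfgs_direction` and the history lists `s_hist` and `y_hist` are hypothetical, and the sketching of these quantities is not shown here.

```python
import numpy as np

def lbfgs_direction(grad, s_hist, y_hist):
    """Approximate (inverse Hessian) @ grad with the L-BFGS two-loop recursion.

    s_hist[k] = x_{k+1} - x_k and y_hist[k] = g_{k+1} - g_k, oldest first.
    Assumes every pair satisfies the curvature condition y.s > 0.
    """
    q = np.asarray(grad, dtype=float).copy()
    coeffs = []
    # First loop: walk the history from the newest pair to the oldest.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / np.dot(y, s)
        alpha = rho * np.dot(s, q)
        q -= alpha * y
        coeffs.append((rho, alpha))
    # Initial inverse-Hessian guess: a scaled identity from the newest pair.
    if s_hist:
        s, y = s_hist[-1], y_hist[-1]
        q *= np.dot(s, y) / np.dot(y, y)
    # Second loop: walk the history from the oldest pair to the newest.
    for s, y, (rho, alpha) in zip(s_hist, y_hist, reversed(coeffs)):
        beta = rho * np.dot(y, q)
        q += (alpha - beta) * s
    return q

# Toy quadratic f(x) = 0.5 * x^T A x, for which y = A s holds exactly.
A = np.diag([1.0, 2.0, 4.0])
S = [np.array([1.0, 0.0, 1.0]), np.array([0.5, -1.0, 0.25])]
Y = [A @ s for s in S]
direction = lbfgs_direction(np.ones(3), S, Y)
```

With an empty history the function returns the gradient unchanged, so the update falls back to a plain first-order step.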
Findings
BEAR requires up to 1000x less memory than first-order sketching algorithms.
BEAR achieves comparable classification accuracy with significantly less memory.
Theoretical convergence rate of BEAR is O(1/t).
Abstract
We consider feature selection for applications in machine learning where the dimensionality of the data is so large that it exceeds the working memory of the (local) computing machine. Unfortunately, current large-scale sketching algorithms show poor memory-accuracy trade-off due to the irreversible collision and accumulation of the stochastic gradient noise in the sketched domain. Here, we develop a second-order ultra-high dimensional feature selection algorithm, called BEAR, which avoids the extra collisions by storing the second-order gradients in the celebrated Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in Count Sketch, a sublinear memory data structure from the streaming literature. Experiments on real-world data sets demonstrate that BEAR requires up to three orders of magnitude less memory space to achieve the same classification accuracy compared to the first-order…
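For readers new to Count Sketch, the following is a minimal Python sketch of the data structure as it is usually described in the streaming literature: several hash rows, a random sign per index, and a median-of-rows query. It is an illustration under stated assumptions, not the authors' code. The class name `CountSketch`, the linear hash family modulo a prime, and the default `depth` and `width` values are all hypothetical choices made for this example.

```python
import numpy as np

class CountSketch:
    """Toy Count Sketch: `depth` hash rows of `width` counters each."""

    def __init__(self, depth=5, width=1024, seed=0):
        self.depth, self.width = depth, width
        self.table = np.zeros((depth, width))
        rng = np.random.default_rng(seed)
        # Assumed hash family: (a * i + b) mod p, with one (a, b) pair per row
        # for the bucket hash and another per row for the sign hash.
        self.p = 2_147_483_647  # the Mersenne prime 2^31 - 1
        self.a = rng.integers(1, self.p, size=(depth, 2))
        self.b = rng.integers(0, self.p, size=(depth, 2))
        self.rows = np.arange(depth)

    def _hash(self, i):
        h = (self.a * i + self.b) % self.p  # shape (depth, 2)
        bucket = h[:, 0] % self.width       # counter index in each row
        sign = 1 - 2 * (h[:, 1] % 2)        # +1 or -1 in each row
        return bucket, sign

    def update(self, i, delta):
        """Add `delta` to coordinate `i` of the implicit high-dimensional vector."""
        bucket, sign = self._hash(i)
        self.table[self.rows, bucket] += sign * delta

    def query(self, i):
        """Estimate coordinate `i` as the median of its signed counters."""
        bucket, sign = self._hash(i)
        return float(np.median(sign * self.table[self.rows, bucket]))

# Memory is depth * width counters, independent of the feature dimension.
cs = CountSketch(depth=5, width=256, seed=0)
cs.update(123_456_789, 3.0)
cs.update(123_456_789, -1.0)
estimate = cs.query(123_456_789)
```

Distinct indices that hash to the same counter collide, and the median across rows only partly cancels that error. This is the collision noise the abstract refers to.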
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Machine Learning and ELM
Methods: Feature Selection
