A Geometric Approach to Online Streaming Feature Selection
Salimeh Yasaei Sekeh, Madan Ravi Ganesh, Shurjo Banerjee, Jason J. Corso, and Alfred O. Hero

TL;DR
This paper introduces a geometric approach to online streaming feature selection, addressing unrealistic assumptions in existing methods, and demonstrates improved performance in a new concurrent streaming setting.
Contribution
The paper proposes Geometric Online Adaptation (GOA), a novel feature selection algorithm that reduces comparison steps and uses a bounded geometric dependency measure, outperforming existing methods.
Findings
GOA outperforms SAOLA on various datasets.
GOA remains effective in the OSFS-SS setting.
Fixing feature limits allows fairer algorithm comparison.
Abstract
Online Streaming Feature Selection (OSFS) is a sequential learning problem where individual features across all samples are made available to algorithms in a streaming fashion. In this work, first, we assert that OSFS's main assumption of having data from all the samples available at runtime is unrealistic, and we introduce a new setting, called OSFS with Streaming Samples (OSFS-SS), where features and samples are streamed concurrently. Second, the primary OSFS method, SAOLA, utilizes an unbounded mutual information measure and requires multiple comparison steps between the stored and incoming feature sets to evaluate a feature's importance. We introduce Geometric Online Adaption, an algorithm that requires relatively fewer feature comparison steps and uses a bounded conditional geometric dependency measure. Our algorithm outperforms several OSFS baselines including SAOLA on a variety of…
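To make the streaming setup in the abstract concrete, here is a minimal Python sketch of a generic online streaming feature selection loop. It is not the paper's algorithm: the squared Pearson correlation is used only as a stand-in for a dependency score bounded in [0, 1], and the names bounded_dependency and stream_select, the two thresholds, and the toy data are all hypothetical. The budget parameter mirrors the idea of fixing the number of selected features so that methods can be compared on equal terms.

```python
def bounded_dependency(xs, ys):
    """Squared Pearson correlation, always in [0, 1].

    Assumption: a simple stand-in for a bounded dependency measure,
    NOT the conditional geometric measure used in the paper.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no dependency information
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return min(1.0, (sxy * sxy) / (sxx * syy))


def stream_select(feature_stream, labels, budget, relevance=0.1, redundancy=0.9):
    """Process features one at a time and keep at most `budget` of them.

    `relevance` and `redundancy` are hypothetical thresholds for this sketch.
    """
    selected = {}  # feature name -> stored column of values
    for name, values in feature_stream:
        if len(selected) >= budget:
            break  # fixed feature limit reached
        # Drop features that are only weakly related to the labels.
        if bounded_dependency(values, labels) < relevance:
            continue
        # Drop features that nearly duplicate an already stored feature.
        if any(bounded_dependency(values, kept) > redundancy
               for kept in selected.values()):
            continue
        selected[name] = values
    return list(selected)


# Toy data (hypothetical): features arrive one by one, all samples visible.
labels = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
stream = [
    ("signal", [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]),  # matches the labels
    ("dup",    [0.0, 2.0, 0.0, 2.0, 0.0, 2.0, 0.0, 2.0]),  # rescaled copy of "signal"
    ("noise",  [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]),  # uncorrelated with labels
    ("other",  [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0]),  # partially informative
]
chosen = stream_select(stream, labels, budget=3)
```

This sketch follows the classic OSFS assumption that every sample is available when a feature arrives. In the OSFS-SS setting described above, the scores would have to be computed on only the samples that have arrived so far.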
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
Methods: Feature Selection
