Online Feature Selection for Efficient Learning in Networked Systems
Xiaoxuan Wang, Rolf Stadler

TL;DR
This paper introduces OSFS, an online feature selection algorithm for networked systems that sharply reduces the number of data sources needed for model training while maintaining or improving model accuracy, addressing the cost and staleness of models trained offline.
Contribution
The paper presents OSFS, a novel online feature selection algorithm that efficiently reduces feature set size and adapts to concept drift in networked system data.
Findings
OSFS reduces feature set size by 1-3 orders of magnitude.
Predictor accuracy with OSFS is comparable to or better than that of offline methods.
OSFS is robust to different sample intervals and concept drift.
Abstract
Current AI/ML methods for data-driven engineering use models that are mostly trained offline. Such models can be expensive to build in terms of communication and computing cost, and they rely on data that is collected over extended periods of time. Further, they become out-of-date when changes in the system occur. To address these challenges, we investigate online learning techniques that automatically reduce the number of available data sources for model training. We present an online algorithm called Online Stable Feature Set Algorithm (OSFS), which selects a small feature set from a large number of available data sources after receiving a small number of measurements. The algorithm is initialized with a feature ranking algorithm, a feature set stability metric, and a search policy. We perform an extensive experimental evaluation of this algorithm using traces from an in-house testbed…
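The three ingredients named in the abstract (a feature ranking algorithm, a feature set stability metric, and a search policy) can be illustrated with a minimal sketch. This is not the authors' OSFS implementation. Every concrete choice below is an assumption made for illustration: ranking by absolute Pearson correlation with the target, Jaccard similarity between consecutive top-k sets as the stability metric, a search policy that tries a fixed list of candidate sizes `k_values` from smallest to largest, and the hypothetical names `rank_features`, `jaccard`, `select_stable_features`, and `synthetic_stream`. The sketch also does not handle concept drift.

```python
import random


def rank_features(X, y):
    """Rank feature indices by |Pearson correlation| with the target.

    Assumed ranking method, chosen only for illustration.
    """
    n = len(y)
    mean_y = sum(y) / n
    norm_y = sum((v - mean_y) ** 2 for v in y) ** 0.5
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_x = sum(col) / n
        norm_x = sum((v - mean_x) ** 2 for v in col) ** 0.5
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, y))
        scores.append(abs(cov) / (norm_x * norm_y) if norm_x > 0 and norm_y > 0 else 0.0)
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def jaccard(a, b):
    """Jaccard similarity of two feature sets (assumed stability metric)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def select_stable_features(stream, k_values, batch=50, threshold=0.9):
    """Read samples until some top-k set is stable across two consecutive batches.

    Returns (sorted feature indices, number of samples read), or
    (None, number of samples read) if the stream ends first.
    """
    X, y = [], []
    previous = None
    for x_t, y_t in stream:
        X.append(x_t)
        y.append(y_t)
        if len(y) % batch:
            continue  # re-rank only at batch boundaries
        ranking = rank_features(X, y)
        if previous is not None:
            for k in k_values:  # assumed search policy: smallest k first
                if jaccard(ranking[:k], previous[:k]) >= threshold:
                    return sorted(ranking[:k]), len(y)
        previous = ranking
    return None, len(y)


def synthetic_stream(n_samples=500, n_features=10, seed=0):
    """Synthetic data: the target depends only on features 2 and 7."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.random() for _ in range(n_features)]
        yield x, x[2] + x[7]


selected, n_used = select_stable_features(synthetic_stream(), k_values=[2])
print(selected, n_used)
```

On this synthetic stream the two informative features dominate the ranking, so the selected set stabilizes after only a few batches. A stricter `threshold` or a smaller `batch` trades the number of measurements read against confidence that the set has settled.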
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Air Quality Monitoring and Forecasting
Methods: Feature Selection
