# A genetic algorithm-based framework for online sparse feature selection in data streams

**Authors:** Guanyu Liu, Jinhang Liu, Guifan He, Yifan Liu, Huabo Bai, Min Zhou

PMC · DOI: 10.3389/fdata.2026.1782461 · Frontiers in Big Data · 2026-02-09

## TL;DR

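This paper introduces GA-OS2FS, an online feature selection method for streaming data with missing values. It imputes missing entries with latent factor analysis and uses a genetic algorithm to assess feature importance, reaching higher accuracy than existing OSFS and OS2FS methods on six real-world datasets.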

## Contribution

GA-OS2FS is a framework for online sparse streaming feature selection that combines latent factor analysis, used to impute missing values, with a genetic algorithm, used to assess feature importance.

## Key findings

- GA-OS2FS outperforms existing OSFS and OS2FS methods in accuracy.
- The method handles missing data by imputing values with a latent factor analysis model.
- Experiments on six real-world datasets confirm the superiority of GA-OS2FS.
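
As a rough illustration of the imputation idea mentioned above, the sketch below fills missing entries with a low-rank latent factor estimate trained by stochastic gradient descent on the observed entries only. It is a generic matrix-factorization sketch, not the paper's model: the function name `lfa_impute`, the use of `None` to mark missing values, and all hyperparameters (`rank`, `lr`, `reg`, `epochs`) are assumptions made for this example.

```python
import random

def lfa_impute(X, rank=2, lr=0.01, reg=0.05, epochs=500, seed=0):
    """Replace None entries of X with a low-rank estimate (hypothetical helper)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Small random latent factors for rows (instances) and columns (features).
    P = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n)]
    Q = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(m)]
    # Only observed cells contribute to the training loss.
    observed = [(i, j) for i in range(n) for j in range(m) if X[i][j] is not None]
    for _ in range(epochs):
        for i, j in observed:
            pred = sum(P[i][k] * Q[j][k] for k in range(rank))
            err = X[i][j] - pred
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on squared error with L2 regularization.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    # Keep observed values untouched; fill only the missing cells.
    return [
        [X[i][j] if X[i][j] is not None
         else sum(P[i][k] * Q[j][k] for k in range(rank))
         for j in range(m)]
        for i in range(n)
    ]

# Toy matrix with two missing cells.
X = [[1.0, 2.0, None], [2.0, None, 6.0], [3.0, 6.0, 9.0]]
filled = lfa_impute(X)
```

Observed values pass through unchanged, so the factor model only influences the cells that were missing. How well those cells are recovered depends on the rank and training settings, which would need tuning on real data.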

## Abstract

High-dimensional streaming data applications commonly use online streaming feature selection (OSFS) techniques. In practice, however, incomplete data caused by equipment failures and technical constraints often poses a significant challenge. Online Sparse Streaming Feature Selection (OS2FS) tackles this issue by imputing missing data via latent factor analysis. Nevertheless, existing OS2FS approaches exhibit considerable limitations in feature evaluation, resulting in degraded performance. To address these shortcomings, this paper introduces a novel genetic algorithm-based online sparse streaming feature selection (GA-OS2FS) framework for data streams, which integrates two key innovations: (1) imputation of missing values using a latent factor analysis model, and (2) application of a genetic algorithm to assess feature importance. Comprehensive experiments conducted on six real-world datasets show that GA-OS2FS surpasses state-of-the-art OSFS and OS2FS methods, consistently attaining higher accuracy through the selection of better feature subsets.
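
To make the second ingredient more concrete, the following is a minimal sketch of a genetic algorithm searching over binary feature masks. The abstract does not specify the paper's encoding, fitness function, or operators, so everything here is an assumption for illustration: the fitness is leave-one-out 1-nearest-neighbour accuracy minus a small per-feature penalty, with tournament selection, one-point crossover, bit-flip mutation, and elitism. The names `loo_1nn_accuracy` and `ga_select` are hypothetical.

```python
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out 1-NN accuracy using only features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def ga_select(X, y, pop_size=20, generations=30, p_mut=0.1, penalty=0.01, seed=0):
    """Return a 0/1 feature mask found by a simple GA (assumes >= 2 features)."""
    rng = random.Random(seed)
    n_feat = len(X[0])

    def fitness(mask):
        # Assumed fitness: accuracy minus a small cost per selected feature.
        return loo_1nn_accuracy(X, y, mask) - penalty * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: carry the best mask forward
        while len(new_pop) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n_feat - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best

# Toy data: feature 0 separates the classes, the rest are noise.
X = [[0.0, 0.3, 0.9, 0.1], [0.1, 0.8, 0.2, 0.7], [0.2, 0.5, 0.6, 0.4],
     [5.0, 0.4, 0.8, 0.2], [5.1, 0.9, 0.1, 0.6], [5.2, 0.6, 0.5, 0.3]]
y = [0, 0, 0, 1, 1, 1]
mask = ga_select(X, y)
```

In a streaming setting, a search like this would presumably run on the features seen so far, after imputation, rather than on a fixed matrix. How GA-OS2FS schedules that evaluation is described in the full paper, not in this summary.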

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12926108/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12926108/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/PMC12926108/full.md

---
Source: https://tomesphere.com/paper/PMC12926108