# Robust Regression via Online Feature Selection under Adversarial Data Corruption

**Authors:** Xuchao Zhang, Shuo Lei, Liang Zhao, Arnold P. Boedihardjo, Chang-Tien Lu

arXiv: 1902.01729 · 2019-02-06

## TL;DR

This paper introduces RoOFS, a robust online feature selection algorithm that effectively learns reliable regression coefficients from streaming, partially accessible, and corrupted data, addressing scalability and adversarial challenges.

## Contribution

The paper presents a novel algorithm that simultaneously handles corrupted data estimation, online feature selection, and scalability in streaming environments.

## Key findings

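- Outperforms existing methods in recovering both the selected features and the regression coefficients, with competitive efficiency
- Proves a restricted error bound for the algorithm relative to the optimal solution
- Demonstrates effectiveness on both synthetic and real-world datasets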

## Abstract

The presence of data corruption in user-generated streaming data, such as social media, motivates a new fundamental problem: learning reliable regression coefficients when the features are not all accessible at one time. Until now, several important challenges still cannot be handled concurrently: 1) corrupted data estimation when only partial features are accessible; 2) online feature selection when data contains adversarial corruption; and 3) scaling to a massive dataset. This paper proposes a novel RObust regression algorithm via Online Feature Selection (*RoOFS*) that concurrently addresses all the above challenges. Specifically, the algorithm iteratively updates the regression coefficients and the uncorrupted set via a robust online feature substitution method. We also prove that our algorithm has a restricted error bound compared to the optimal solution. Extensive empirical experiments on both synthetic and real-world datasets demonstrate that our new method is superior to existing methods in the recovery of both feature selection and regression coefficients, with very competitive efficiency.
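The abstract describes an alternating scheme: update the regression coefficients, update the estimated uncorrupted set, and substitute features as they arrive. The sketch below illustrates that general idea only and is not the paper's RoOFS algorithm, whose details are in the full text. It assumes a generic stand-in: trimmed least squares (keep the samples with the smallest residuals) combined with a simple rule that drops the smallest-magnitude coefficient once more than `k` features are held. The names `robust_fit`, `online_select`, `k`, and `n_clean`, and the synthetic data, are hypothetical.

```python
import numpy as np

def robust_fit(X, y, n_clean, n_iter=20):
    """Alternate least squares with re-estimating the uncorrupted set.

    Assumption: the uncorrupted set is taken to be the n_clean samples
    with the smallest absolute residuals (a hard-thresholding heuristic).
    """
    clean = np.arange(X.shape[0])  # start by trusting every sample
    for _ in range(n_iter):
        w, *_ = np.linalg.lstsq(X[clean], y[clean], rcond=None)
        residuals = np.abs(y - X @ w)
        new_clean = np.sort(np.argsort(residuals)[:n_clean])
        if np.array_equal(new_clean, clean):
            break  # the estimated uncorrupted set has stabilized
        clean = new_clean
    return w, clean

def online_select(feature_stream, y, k, n_clean):
    """Consume (feature_id, column) pairs one at a time, keeping at most k.

    Assumption: when the budget is exceeded, the feature whose robust
    coefficient has the smallest magnitude is dropped.
    """
    kept_ids, kept_cols = [], []
    for j, col in feature_stream:
        kept_ids.append(j)
        kept_cols.append(col)
        if len(kept_ids) > k:
            w, _ = robust_fit(np.column_stack(kept_cols), y, n_clean)
            drop = int(np.argmin(np.abs(w)))
            del kept_ids[drop]
            del kept_cols[drop]
    # Final fit on the surviving features.
    w, clean = robust_fit(np.column_stack(kept_cols), y, n_clean)
    return kept_ids, w, clean

# Synthetic demo: 3 informative features out of 10, 10% corrupted responses.
rng = np.random.default_rng(0)
n, d = 300, 10
X = rng.standard_normal((n, d))
true_w = np.zeros(d)
true_w[[1, 4, 7]] = [5.0, -4.0, 6.0]
y = X @ true_w + 0.1 * rng.standard_normal(n)
corrupted = rng.choice(n, size=30, replace=False)
y[corrupted] += rng.choice([-100.0, 100.0], size=30)  # large corruptions

stream = ((j, X[:, j]) for j in range(d))
kept_ids, w, clean = online_select(stream, y, k=3, n_clean=250)
print(kept_ids, np.round(w, 2))
```

Both `k` and `n_clean` are assumed to be known in advance here. The full matrix `X` is built only to generate the demo stream; `online_select` itself holds at most `k + 1` feature columns at any time.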

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.01729/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/1902.01729/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1902.01729/full.md

---
Source: https://tomesphere.com/paper/1902.01729