# Feature Selection for Data Integration with Mixed Multi-view Data

**Authors:** Yulia Baker, Tiffany M. Tang, Genevera I. Allen

arXiv: 1903.11232 · 2020-01-13

## TL;DR

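This paper introduces the Block Randomized Adaptive Iterative Lasso (B-RAIL), a feature selection method for high-dimensional multi-view data with mixed types (e.g., continuous, binary, count-valued). It targets data integration through sparse regression and graph selection, and is applied to inferring the ovarian cancer gene regulatory network.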

## Contribution

We develop B-RAIL, a practical, theoretically guided feature selection method that combines the randomized Lasso, adaptive weighting schemes, and stability selection to handle the heterogeneity of mixed multi-view data in sparse regression and graph selection.

## Key findings

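- B-RAIL outperforms existing methods in simulations.
- In a case study inferring the ovarian cancer gene regulatory network, B-RAIL identifies well-known ovarian cancer biomarkers.
- The same case study hints at novel candidate biomarkers for future ovarian cancer research.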

## Abstract

Data integration methods that analyze multiple sources of data simultaneously can often provide more holistic insights than can separate inquiries of each data source. Motivated by the advantages of data integration in the era of "big data", we investigate feature selection for high-dimensional multi-view data with mixed data types (e.g. continuous, binary, count-valued). This heterogeneity of multi-view data poses numerous challenges for existing feature selection methods. However, after critically examining these issues through empirical and theoretically-guided lenses, we develop a practical solution, the Block Randomized Adaptive Iterative Lasso (B-RAIL), which combines the strengths of the randomized Lasso, adaptive weighting schemes, and stability selection. B-RAIL serves as a versatile data integration method for sparse regression and graph selection, and we demonstrate the effectiveness of B-RAIL through extensive simulations and a case study to infer the ovarian cancer gene regulatory network. In this case study, B-RAIL successfully identifies well-known biomarkers associated with ovarian cancer and hints at novel candidates for future ovarian cancer research.
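To make two of the ingredients named above more concrete, here is a minimal sketch of the randomized Lasso combined with stability selection on a single continuous data view. It is not B-RAIL itself: it omits the block structure, the adaptive iterative weighting, and the handling of binary or count-valued data. The function names, the penalty `lam`, the `weakness` range for the random penalty scaling, the 0.8 selection threshold, and the simulated data are all illustrative assumptions, not values from the paper.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, penalties, n_sweeps=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + sum_j penalties[j] * |b_j|.

    No intercept is fitted, so the data are assumed to be roughly centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # (1/n) * x_j' (partial residual that excludes feature j)
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, penalties[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def selection_frequencies(X, y, lam=0.3, weakness=0.5, n_resamples=50, seed=0):
    """Fraction of random half-samples in which each feature gets a nonzero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_resamples):
        idx = rng.sample(range(n), n // 2)  # subsample without replacement
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        # Randomized Lasso: each feature's penalty is inflated by a random factor.
        penalties = [lam / rng.uniform(weakness, 1.0) for _ in range(p)]
        beta = weighted_lasso(Xs, ys, penalties)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_resamples for c in counts]


def simulate(n=100, p=8, seed=1):
    """Toy data (an assumption for illustration): only features 0 and 1 affect y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 3.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]
    return X, y


X, y = simulate()
freq = selection_frequencies(X, y)
# Keep features selected in at least 80% of the resamples (threshold is illustrative).
selected = [j for j, f in enumerate(freq) if f >= 0.8]
print("selection frequencies:", freq)
print("stable features:", selected)
```

The point of the sketch is the design choice that stability selection rests on: features are ranked by how consistently they survive across perturbed fits, rather than by a single Lasso fit at one penalty value. In the mixed multi-view setting the paper addresses, each data type would call for its own loss and penalty scaling, which this single-view example does not attempt.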

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.11232/full.md

## Figures

108 figures with captions in the complete paper: https://tomesphere.com/paper/1903.11232/full.md

## References

53 references — full list in the complete paper: https://tomesphere.com/paper/1903.11232/full.md

---
Source: https://tomesphere.com/paper/1903.11232