Regression-based heterogeneity analysis to identify overlapping subgroup structure in high-dimensional data

Ziye Luo; Xinyue Yao; Yifan Sun; Xinyan Fan

arXiv:2211.15152·stat.ME·November 29, 2022

TL;DR

This paper introduces a novel regression-based heterogeneity analysis method that identifies overlapping subgroups in high-dimensional data, improving understanding of complex diseases and biological functions.

Contribution

It develops an approach that accommodates both overlapping subgroups and high-dimensional features, using regularization for feature selection and stable estimation.

Findings

01 Outperforms existing methods in simulations

02 Successfully identifies overlapping subgroups in cancer data

03 Provides stable and accurate subgroup structures

Abstract

Heterogeneity is a hallmark of complex diseases. Regression-based heterogeneity analysis, which is directly concerned with outcome-feature relationships, has led to a deeper understanding of disease biology. Such an analysis identifies the underlying subgroup structure and estimates the subgroup-specific regression coefficients. However, most existing regression-based heterogeneity analyses can only address disjoint subgroups; that is, each sample is assigned to exactly one subgroup. In reality, some samples have multiple labels: for example, many genes have several biological functions, and some cells of pure cell types transition into other types over time. This suggests that their outcome-feature relationships (regression coefficients) can be a mixture of the relationships in more than one subgroup, and as a result, disjoint subgrouping results can be unsatisfactory. To this…
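
The idea of a sample whose regression coefficients mix those of several subgroups can be illustrated with a small simulation. The sketch below is not the paper's estimator. It only generates data under one possible reading of "mixture", namely a convex combination of subgroup-specific coefficient vectors, which is an assumption. The function names, coefficient values, and membership weights are all hypothetical.

```python
import random


def mixed_coefficients(weights, subgroup_betas):
    """Return sum_k weights[k] * subgroup_betas[k], a convex combination.

    Assumption: an overlapping sample's coefficients are a weighted
    average of the subgroup-specific coefficient vectors.
    """
    if abs(sum(weights) - 1.0) > 1e-9 or min(weights) < 0:
        raise ValueError("membership weights must be non-negative and sum to 1")
    p = len(subgroup_betas[0])
    return [
        sum(w * beta[j] for w, beta in zip(weights, subgroup_betas))
        for j in range(p)
    ]


def simulate(n, subgroup_betas, memberships, noise_sd=0.1, seed=0):
    """Draw n samples (x, y, weights) with y = x . beta_i + Gaussian noise."""
    rng = random.Random(seed)
    p = len(subgroup_betas[0])
    samples = []
    for _ in range(n):
        # Pick a membership pattern: pure subgroup or an overlap of both.
        weights = rng.choice(memberships)
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        beta_i = mixed_coefficients(weights, subgroup_betas)
        y = sum(xj * bj for xj, bj in zip(x, beta_i)) + rng.gauss(0.0, noise_sd)
        samples.append((x, y, weights))
    return samples


# Hypothetical values chosen only for illustration.
BETAS = [[2.0, -1.0, 0.0], [-2.0, 0.0, 1.0]]
MEMBERSHIPS = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

if __name__ == "__main__":
    for x, y, weights in simulate(5, BETAS, MEMBERSHIPS):
        print(weights, round(y, 3))
```

A disjoint method would have to force the `(0.5, 0.5)` samples into one of the two pure subgroups, which is the mismatch the abstract describes.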

Code & Models

Repositories

foliag/subgroup (Official)
