Nonparametric Bayesian sparse factor models with application to gene expression modeling

David Knowles; Zoubin Ghahramani

arXiv:1011.6293·stat.AP·July 29, 2011

TL;DR

This paper introduces a nonparametric Bayesian factor analysis model that uses the Indian Buffet Process to infer a sparse, potentially infinite set of latent factors, and evaluates it on gene expression data.

Contribution

It develops a nonparametric Bayesian sparse factor model with an IBP prior, which allows the number of latent factors to be inferred automatically in gene expression analysis.

Findings

01

Model successfully infers sparse latent factors.

02

Effective on simulated and real gene expression datasets.

03

Demonstrates flexibility and scalability of the approach.

Abstract

A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data $Y$ is modeled as a linear superposition, $G$, of a potentially infinite number of hidden factors, $X$. The Indian Buffet Process (IBP) is used as a prior on $G$ to incorporate sparsity and to allow the number of latent features to be inferred. The model's utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.
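
The generative structure described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' implementation: the abstract only states that $Y$ is a linear superposition $G$ of factors $X$ with an IBP prior on $G$. The sketch assumes the sparsity pattern of $G$ is a binary IBP draw `Z`, and that the nonzero weights, the factors, and the noise are all Gaussian. The function names `sample_ibp` and `simulate_data`, and the parameters `alpha` and `noise_std`, are hypothetical and chosen for illustration.

```python
import numpy as np

def sample_ibp(num_rows, alpha, rng):
    """Draw a binary matrix Z from the IBP (one row per gene, one column per factor)."""
    columns = []  # each entry lists the row indices that use that factor
    for i in range(num_rows):
        # Existing factors: row i uses factor k with probability m_k / (i + 1),
        # where m_k is the number of earlier rows that already use it.
        for members in columns:
            if rng.random() < len(members) / (i + 1):
                members.append(i)
        # New factors: row i introduces Poisson(alpha / (i + 1)) of them.
        for _ in range(rng.poisson(alpha / (i + 1))):
            columns.append([i])
    Z = np.zeros((num_rows, len(columns)), dtype=int)
    for k, members in enumerate(columns):
        Z[members, k] = 1
    return Z

def simulate_data(num_genes=50, num_samples=20, alpha=3.0, noise_std=0.1, seed=0):
    """Simulate Y = G X + noise with a sparse mixing matrix G (assumed Gaussian parts)."""
    rng = np.random.default_rng(seed)
    Z = sample_ibp(num_genes, alpha, rng)      # sparsity pattern of G
    num_factors = Z.shape[1]                   # random, not fixed in advance
    W = rng.normal(size=(num_genes, num_factors))
    G = Z * W                                  # zero wherever Z is zero
    X = rng.normal(size=(num_factors, num_samples))
    noise = noise_std * rng.normal(size=(num_genes, num_samples))
    Y = G @ X + noise
    return Y, G, X, Z

if __name__ == "__main__":
    Y, G, X, Z = simulate_data()
    print("factors drawn:", Z.shape[1])
    print("fraction of nonzero entries in G:", Z.mean() if Z.size else 0.0)
```

The number of columns of `Z` varies from draw to draw, which is the sense in which the number of latent factors is left open rather than fixed beforehand. Inference (recovering `G` and `X` from `Y`) is not shown here.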
