Transcription factor binding site prediction with multivariate gene expression data
Nancy R. Zhang, Mary C. Wildermuth, Terence P. Speed

TL;DR
This paper introduces a flexible joint modeling approach for promoter sequences and multivariate gene expression data to improve the prediction of transcription factor binding sites, capturing combinatorial and spacing effects in multi-sample microarray experiments.
Contribution
It presents a novel adaptive modeling method that captures spacing-dependent regulatory modules and applies it successfully to yeast and Arabidopsis time-course data.
Findings
Successfully identified known cis-acting elements
Predicted novel regulatory elements
Captured spacing effects in transcription factor binding
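One way to picture the spacing effect listed above is as a sequence feature that counts pairs of sites separated by a bounded gap. The sketch below is illustrative only and is not the paper's method: the function names, the motifs `ACGT` and `CCGG`, the toy promoter, and the exact-match, forward-strand-only search are all assumptions made for the example.

```python
def find_sites(sequence, motif):
    """Return start positions of every exact (possibly overlapping) match."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def count_spaced_pairs(sequence, motif_a, motif_b, min_gap, max_gap):
    """Count site pairs where motif_b starts min_gap..max_gap bases after motif_a ends."""
    count = 0
    for a_start in find_sites(sequence, motif_a):
        a_end = a_start + len(motif_a)
        for b_start in find_sites(sequence, motif_b):
            gap = b_start - a_end
            if min_gap <= gap <= max_gap:
                count += 1
    return count


# Hypothetical promoter: ACGT, a 2-base gap, CCGG, ten T's, then CCGG again.
promoter = "ACGT" + "AA" + "CCGG" + "T" * 10 + "CCGG"

near_pairs = count_spaced_pairs(promoter, "ACGT", "CCGG", 0, 5)   # only the close site qualifies
all_pairs = count_spaced_pairs(promoter, "ACGT", "CCGG", 0, 20)   # both sites qualify
```

A feature like `near_pairs` could then enter a regression as a predictor in place of, or alongside, the individual motif counts, so that the gap window itself becomes something the model can select.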
Abstract
Multi-sample microarray experiments have become a standard experimental method for studying biological systems. A frequent goal in such studies is to unravel the regulatory relationships between genes. During the last few years, regression models have been proposed for the de novo discovery of cis-acting regulatory sequences using gene expression data. However, when applied to multi-sample experiments, existing regression-based methods model each individual sample separately. To better capture the dynamic relationships in multi-sample microarray experiments, we propose a flexible method for the joint modeling of promoter sequence and multivariate expression data. In higher-order eukaryotic genomes, expression regulation usually involves combinatorial interactions between several transcription factors. Experiments have shown that spacing between transcription factor binding sites can…
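The per-sample baseline that the abstract contrasts against can be sketched as one independent regression per sample. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single motif, ordinary least squares, and synthetic noise-free data, and the names `ols_slope_intercept`, `motif_counts`, `true_slopes`, and `expression` are hypothetical.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: occurrences of one motif in six promoters.
motif_counts = [0, 1, 2, 3, 1, 0]

# Hypothetical effect of the motif at three time points (synthetic, noise-free).
true_slopes = [0.5, 1.0, -0.25]

# expression[gene][sample]: each sample responds to the motif with its own slope.
expression = [[slope * count for slope in true_slopes] for count in motif_counts]

# Per-sample modeling: every sample is fitted on its own, ignoring the others.
fits = [
    ols_slope_intercept(motif_counts, [row[sample] for row in expression])
    for sample in range(len(true_slopes))
]
```

Each entry of `fits` is estimated without reference to neighboring time points, which is the limitation a joint model of the multivariate response is meant to address.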
