Identifying statistically significant patterns in gene expression data
Patrick E. McSharry, Edmund J. Crampin

TL;DR
This paper introduces SALT (Significance Analysis of Linkage Trees), a statistical method for assessing the significance of gene co-expression patterns found by hierarchical linkage-based clustering, making conclusions drawn from gene expression analysis more reliable.
Contribution
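The paper presents SALT, a statistical approach for evaluating the significance of clustering results in gene expression data. It addresses the limited attention that the statistical significance of clustering results has received compared with the clustering algorithms themselves.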
Findings
A modified version of the standard linkage technique, complete-linkage, must be used for the significance analysis.
SALT effectively identifies significant gene co-expression patterns.
Application to real data confirms the method's utility.
Abstract
Motivation: Clustering techniques are routinely applied to identify patterns of co-expression in gene expression data. Co-regulation, and involvement of genes in similar cellular function, is subsequently inferred from the clusters which are obtained. Increasingly sophisticated algorithms have been applied to microarray data, but less attention has been given to the statistical significance of the results of clustering studies. We present a technique for the analysis of commonly used hierarchical linkage-based clustering called Significance Analysis of Linkage Trees (SALT). Results: The statistical significance of pairwise similarity levels between gene expression profiles, a measure of co-expression, is established using a surrogate data analysis method. We find that a modified version of the standard linkage technique, complete-linkage, must be used to generate hierarchical…
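The two ingredients named in the abstract, a surrogate-data significance level for pairwise similarity and complete-linkage clustering, can be illustrated with a minimal Python sketch. This is not the authors' SALT implementation: the excerpt does not specify the surrogate scheme or the similarity measure, so the sketch assumes Pearson correlation and surrogates made by shuffling each profile independently across conditions. The function names `surrogate_threshold` and `significant_clusters`, the parameters `n_surrogates` and `alpha`, and the synthetic data are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist


def surrogate_threshold(expr, n_surrogates=200, alpha=0.05, seed=0):
    """Correlation level that shuffled (surrogate) profiles exceed with probability alpha.

    Assumption: surrogates are built by permuting each gene's profile
    independently across conditions, which destroys any co-expression.
    """
    rng = np.random.default_rng(seed)
    null_corrs = []
    for _ in range(n_surrogates):
        shuffled = rng.permuted(expr, axis=1)  # shuffle every row independently
        # pdist's "correlation" metric is 1 - Pearson r, so convert back to r
        null_corrs.append(1.0 - pdist(shuffled, metric="correlation"))
    return float(np.quantile(np.concatenate(null_corrs), 1.0 - alpha))


def significant_clusters(expr, r_threshold):
    """Complete-linkage tree cut at the surrogate-derived similarity level."""
    dist = pdist(expr, metric="correlation")
    tree = linkage(dist, method="complete")
    # With complete linkage, a cut at distance t leaves clusters whose
    # largest internal pairwise distance is at most t.
    return fcluster(tree, t=1.0 - r_threshold, criterion="distance")


# Synthetic example: 5 genes sharing one pattern plus 5 unrelated genes
rng = np.random.default_rng(1)
pattern = np.sin(np.linspace(0, 2 * np.pi, 12))
co_expressed = pattern + 0.1 * rng.normal(size=(5, 12))
unrelated = rng.normal(size=(5, 12))
expr = np.vstack([co_expressed, unrelated])

r_star = surrogate_threshold(expr)
labels = significant_clusters(expr, r_star)
print("threshold r* =", round(r_star, 3))
print("cluster labels:", labels)
```

In this sketch, complete linkage is what ties the tree to the pairwise threshold: the height at which two clusters merge is their largest pairwise distance, so every pair of genes inside a cluster cut at `1 - r_star` has correlation of at least `r_star`. Single or average linkage does not give that guarantee.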
Taxonomy
Topics: Gene expression and cancer classification · Evolutionary Algorithms and Applications · Algorithms and Data Compression
