Statistical and knowledge supported visualization of multivariate data
Magnus Fontes

TL;DR
This paper introduces accessible statistical and mathematical tools for visualizing and exploring high-dimensional multivariate data, exemplified through genome-wide microarray datasets, aiming to stimulate further research in data exploration methods.
Contribution
It presents a comprehensive, accessible framework combining statistical tools and visualization techniques for multivariate data exploration, with applications to genomic datasets.
Findings
Applied methodology to genome-wide DNA microarray data
Proposed a general exploratory approach for high-dimensional datasets
Highlighted potential for future theoretical developments in data visualization
Abstract
In the present work we have selected a collection of statistical and mathematical tools useful for the exploration of multivariate data, and we present them in a form that is meant to be particularly accessible to a classically trained mathematician. We give self-contained and streamlined introductions to principal component analysis, multidimensional scaling, and statistical hypothesis testing. Within the presented mathematical framework we then propose a general exploratory methodology for the investigation of real-world, high-dimensional datasets that builds on statistical and knowledge-supported visualizations. We exemplify the proposed methodology by applying it to several different genome-wide DNA microarray datasets. The exploratory methodology should be seen as an embryo that can be expanded and developed in many directions. As an example we point out some recent promising advances…
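As an illustration of the first tool named in the abstract, the following is a minimal sketch of principal component analysis computed through the singular value decomposition. It is not taken from the paper: the function name `pca_project`, the samples-by-variables orientation of the matrix, and the synthetic two-group data are all assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X (samples x variables) onto the leading principal components.

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each variable (column)
    # Thin SVD of the centered matrix: Xc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    explained = s**2 / np.sum(s**2)  # fraction of total variance per component
    return scores, Vt[:n_components], explained[:n_components]


# Synthetic stand-in for an expression matrix (assumption): 20 samples, 500 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))
X[:10, :50] += 3.0  # shift 50 variables in the first 10 samples to create two groups

scores, components, explained = pca_project(X)
print(scores.shape, components.shape)
```

Plotting the two columns of `scores` against each other gives the kind of low-dimensional sample view that an exploratory analysis would start from. In this synthetic setup the two groups are expected to separate along the first component.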
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Gene Regulatory Network Analysis
