# Perfect collinearity not created equal: measuring and visualizing the severity of multi-collinearity of modern omics data

**Authors:** Wei Q. Deng, Radu V. Craiu, Lei Sun

PMC · DOI: 10.1515/sagmb-2025-0043 · 2026-02-10

## TL;DR

This paper introduces new measures to quantify and visualize multi-collinearity in high-dimensional data, such as genomic data, to better understand data redundancy and its impact on statistical analysis.

## Contribution

The paper proposes individualized and global measures to assess and visualize multi-collinearity in high-dimensional omics data.

## Key findings

- The new individualized measures can be used to visualize patterns of perfect collinearity in high-dimensional data.
- The measures reveal differences in linkage disequilibrium between sexes on the human X chromosome.
- The methods highlight gene regions with excessive multi-collinearity.

## Abstract

Multi-collinearity frequently occurs in modern statistical applications and, when ignored, can negatively impact model selection and statistical inference. Though perfect collinearity is always present in “n < p” data, we demonstrate that perfect collinearity arises differently, from diverse data redundancy patterns and/or data dimensions. Classic tools and measures that were developed for “n > p” data cannot be used to distinguish or visualize these patterns in the high-dimensional regime. Here we propose 1) new individualized measures that can be used to visualize patterns of perfect collinearity, and subsequently 2) global measures to assess the overall burden of multi-collinearity irrespective of data dimensions. We applied these measures to human X chromosome data to understand similarities and differences in linkage disequilibrium structure due to sex and genetic features. The measures can highlight gene regions of excessive multi-collinearity and contrast the severity of perfect collinearity between the sexes. The utility of these measures in high-dimensional statistical applications is also discussed.
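The abstract's starting point, that perfect collinearity is always present in “n < p” data, can be checked numerically. The following is a minimal sketch on simulated Gaussian data, not the measures proposed in the paper; the helper names `centered_rank` and `min_corr_eigenvalue` are hypothetical and introduced only for illustration. It shows that a column-centered matrix with fewer rows than columns is necessarily rank deficient, so its sample correlation matrix is singular and classic diagnostics that rely on inverting it (for example, variance inflation factors) are not defined.

```python
import numpy as np


def centered_rank(X):
    """Numerical rank of the column-centered data matrix (hypothetical helper)."""
    Xc = X - X.mean(axis=0)
    return int(np.linalg.matrix_rank(Xc))


def min_corr_eigenvalue(X):
    """Smallest eigenvalue of the sample correlation matrix of the columns."""
    R = np.corrcoef(X, rowvar=False)
    return float(np.linalg.eigvalsh(R).min())


rng = np.random.default_rng(0)

# Simulated data only: rows are samples, columns are variables.
X_tall = rng.standard_normal((200, 10))  # n > p
X_wide = rng.standard_normal((10, 200))  # n < p

for name, X in [("n > p", X_tall), ("n < p", X_wide)]:
    n, p = X.shape
    print(
        f"{name}: n={n}, p={p}, rank={centered_rank(X)}, "
        f"min eigenvalue={min_corr_eigenvalue(X):.2e}"
    )
```

In the wide case the rank cannot exceed n - 1 after centering, so the smallest eigenvalue of the correlation matrix is zero up to floating-point error regardless of how the columns are related. This is why a single "singular or not" check cannot distinguish the different redundancy patterns the abstract describes.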

## Linked entities

- **Species:** Homo sapiens (taxon 9606)

## Full-text entities

- **Genes:** sRs [NCBI Gene 140821], HLA-A (major histocompatibility complex, class I, A) [NCBI Gene 3105] {aka HLAA}, TCF20 (transcription factor 20) [NCBI Gene 6942] {aka AR1, DDVIBA, SPBP, TCF-20}
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12909097/full.md

---
Source: https://tomesphere.com/paper/PMC12909097