ODBAE: a high-performance model identifying complex phenotypes in high-dimensional biological datasets
Yafei Shen, Tao Zhang, Zhiwei Liu, Kalliopi Kostelidou, Ying Xu, Ling Yang

TL;DR
ODBAE is a machine learning approach that detects complex, multi-parameter phenotypes in high-dimensional biological data by capturing latent relationships among indicators and identifying both influential and high-leverage outliers.
Contribution
This paper introduces ODBAE, a new autoencoder-based method with a revised loss function for detecting complex outliers in biological datasets, revealing joint abnormalities and novel gene associations.
Findings
ODBAE successfully identifies knockout mice with abnormal multi-indicator phenotypes.
The method uncovers new metabolism-related genes linked to complex phenotypes.
It reveals coordinated abnormalities across multiple metabolic indicators.
Abstract
Identifying complex phenotypes from high-dimensional biological data is challenging due to the intricate interdependencies among different physiological indicators. Traditional approaches often focus on detecting outliers in single variables, overlooking the broader network of interactions that contribute to phenotype emergence. Here, we introduce ODBAE (Outlier Detection using Balanced Autoencoders), a machine learning method designed to uncover both subtle and extreme outliers by capturing latent relationships among multiple physiological parameters. ODBAE's revised loss function enhances its ability to detect two key types of outliers: influential points (IP), which disrupt latent correlations between dimensions, and high leverage points (HLP), which deviate from the norm but go undetected by traditional autoencoder-based methods. Using data from the International Mouse Phenotyping…
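The distinction between the two outlier types can be illustrated with a toy example. The sketch below is not ODBAE and does not use its revised loss function. It assumes a linear autoencoder with a one-dimensional bottleneck (fitted in closed form via SVD, which is equivalent to PCA) and synthetic two-indicator data, and all names and values are hypothetical. It shows why plain reconstruction error tends to flag an influential point, which breaks the correlation between indicators, while missing a high leverage point, which is extreme but still follows the correlation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" samples: two strongly correlated indicators (illustrative only).
t = rng.normal(size=500)
X = np.column_stack([t, t]) + 0.1 * rng.normal(size=(500, 2))

# Linear autoencoder with a 1-D bottleneck, fitted in closed form via SVD (PCA).
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
w = vt[0]  # unit vector shared by the encoder and the decoder


def reconstruction_error(x):
    """Squared reconstruction error of each row of x."""
    centered = np.atleast_2d(x) - mean
    z = centered @ w          # encode to the 1-D latent space
    recon = np.outer(z, w)    # decode back to indicator space
    return ((centered - recon) ** 2).sum(axis=1)


# Flag samples whose error exceeds the 99th percentile of the training errors.
threshold = np.quantile(reconstruction_error(X), 0.99)

# Influential point: ordinary magnitude, but the indicators move in opposite directions.
err_ip = reconstruction_error([2.0, -2.0])[0]
# High leverage point: extreme magnitude, but the usual correlation is preserved.
err_hlp = reconstruction_error([8.0, 8.0])[0]

print(f"threshold={threshold:.4f}  IP error={err_ip:.4f}  HLP error={err_hlp:.4f}")
print("IP flagged:", err_ip > threshold, "| HLP flagged:", err_hlp > threshold)
```

In this toy setting the high leverage point lies almost exactly on the learned axis, so its reconstruction error stays small even though it is far from the bulk of the data. This is the kind of gap in reconstruction-error-based detection that the abstract describes.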
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Fault Detection and Control Systems
