Pretrained-Guided Conditional Diffusion Models for Microbiome Data Analysis
Xinyuan Shi, Fangfang Zhu, Wenwen Min

TL;DR
This paper introduces mbVDiT, a pre-trained conditional diffusion model that leverages patient metadata and variational autoencoders to improve microbiome data imputation and denoising, especially in cancer research.
Contribution
The paper presents a novel pre-trained conditional diffusion model that incorporates clinical metadata and a variational autoencoder (VAE) to improve microbiome data imputation and denoising.
Findings
Outperforms existing microbiome data imputation methods.
Effectively utilizes patient metadata for improved accuracy.
Demonstrates robustness across multiple cancer datasets.
Abstract
Emerging evidence indicates that human cancers are intricately linked to human microbiomes, forming an inseparable connection. However, sample sizes are limited and significant data loss occurs during collection for various reasons, so some machine learning methods have been proposed to address the issue of missing data. These methods have not fully utilized the known clinical information of patients to enhance the accuracy of data imputation. Therefore, we introduce mbVDiT, a novel pre-trained conditional diffusion model for microbiome data imputation and denoising, which uses the unmasked data and patient metadata as conditional guidance for imputing missing values. It also uses a VAE to integrate other public microbiome datasets to enhance model performance. The results on the microbiome datasets from three different cancer types demonstrate the performance of our methods in…
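The conditioning idea described in the abstract (unmasked values plus patient metadata guiding the imputation of masked values) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names make_condition_and_target and forward_noise, the linear noise schedule, and the way the denoiser input is assembled are all assumptions made for illustration, and the VAE latent encoding mentioned in the abstract is omitted.

```python
import numpy as np

def make_condition_and_target(observed_mask, hide_frac, rng):
    """Split observed entries into a conditioning set and a held-out target set.

    Hypothetical helper: hides a random fraction of the observed entries so a
    model could be trained to reconstruct them from the remaining ones.
    """
    hide = (rng.random(observed_mask.shape) < hide_frac) & observed_mask
    cond_mask = observed_mask & ~hide   # entries the model may look at
    target_mask = hide                  # entries the model must reconstruct
    return cond_mask, target_mask

def forward_noise(x0, t, alpha_bar, rng):
    """Standard DDPM forward step: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
n_samples, n_taxa, n_meta = 8, 16, 3

# Toy stand-ins (assumptions): an abundance matrix, a missingness mask,
# and numerically encoded patient metadata.
x = rng.random((n_samples, n_taxa))
observed_mask = rng.random(x.shape) > 0.2
metadata = rng.random((n_samples, n_meta))

# Linear beta schedule, a common default rather than the paper's setting.
betas = np.linspace(1e-4, 0.02, 100)
alpha_bar = np.cumprod(1.0 - betas)

cond_mask, target_mask = make_condition_and_target(observed_mask, 0.3, rng)
xt, eps = forward_noise(x, 50, alpha_bar, rng)

# One possible denoiser input: noised targets, clean conditioning values,
# the conditioning mask, and the metadata, concatenated per sample.
model_input = np.concatenate(
    [
        np.where(target_mask, xt, 0.0),
        np.where(cond_mask, x, 0.0),
        cond_mask.astype(float),
        metadata,
    ],
    axis=1,
)
```

In a setup like this, a denoising network would be trained to predict eps from model_input, with the loss evaluated only where target_mask is true, so that the model learns to fill in hidden values from what remains visible and from the metadata.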
Taxonomy
Topics: Statistical Methods and Inference
Methods: Diffusion
