TL;DR
This paper introduces a deep learning method, based on a residual network, that removes systematic batch effects from biological datasets such as mass cytometry and single-cell RNA-seq by minimizing the Maximum Mean Discrepancy (MMD) between the distributions of replicates measured in different batches.
Contribution
A novel residual network approach that uses MMD minimization to correct batch effects in high-dimensional biological data.
Findings
Effectively reduces batch effects in mass cytometry data.
Successfully attenuates batch effects in single-cell RNA-seq datasets.
Outperforms existing batch correction methods.
Abstract
Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error is a combination of systematic components, originating from the measuring instrument, and random measurement errors. Several novel biological technologies, such as mass cytometry and single-cell RNA-seq, are plagued with systematic errors that may severely affect statistical analysis if the data is not properly calibrated. We propose a novel deep learning approach for removing systematic batch effects. Our method is based on a residual network, trained to minimize the Maximum Mean Discrepancy (MMD) between the multivariate distributions of two replicates, measured in different batches. We apply our method to mass cytometry and single-cell RNA-seq datasets, and demonstrate that it effectively attenuates batch effects.
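As a rough illustration of the idea in the abstract, the sketch below estimates the squared MMD between two batches with a Gaussian kernel and applies a small residual map (input plus a learned correction) to the source batch. It is a minimal sketch under stated assumptions, not the paper's implementation: the kernel bandwidth `sigma`, the layer sizes, the synthetic data, and the helper names `gaussian_kernel`, `mmd2`, and `residual_map` are all hypothetical, and the correction is set by hand rather than trained.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased (V-statistic) estimate of the squared MMD between samples x and y.
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())

def residual_map(x, w1, b1, w2, b2):
    # One residual block: the output is the input plus a correction term.
    # With w2 and b2 set to zero, the block is exactly the identity.
    h = np.maximum(x @ w1 + b1, 0.0)  # ReLU hidden layer
    return x + h @ w2 + b2

rng = np.random.default_rng(0)
d, hidden = 5, 8

# Synthetic stand-ins for two replicates (assumption: additive batch shift).
target = rng.normal(size=(200, d))
source = rng.normal(size=(200, d)) + 0.5

# Start from the identity map, as a residual parameterization allows.
w1 = rng.normal(scale=0.1, size=(d, hidden))
b1 = np.zeros(hidden)
w2 = np.zeros((hidden, d))
b2 = np.zeros(d)

identity_out = residual_map(source, w1, b1, w2, b2)
before = mmd2(source, target)

# Stand-in for training: set the output bias by hand to undo the shift.
# A real method would instead learn all weights by gradient descent on mmd2.
calibrated = residual_map(source, w1, b1, w2, -0.5 * np.ones(d))
after = mmd2(calibrated, target)

print(f"MMD^2 before: {before:.4f}, after: {after:.4f}")
```

The residual form means the map starts as the identity and only has to learn a small correction, which suits the setting where batch effects are modest distortions of an otherwise shared distribution.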
