Projected $t$-SNE for batch correction
Emanuele Aliverti, Jeff Tilson, Dayne Filer, Benjamin Babcock, Alejandro Colaneri, Jennifer Ocasio, Timothy R. Gershon, Kirk C. Wilhelmsen, and David B. Dunson

TL;DR
This paper introduces a new method for removing batch effects from t-SNE embeddings in high-dimensional biomedical data, improving interpretability while preserving biological signals.
Contribution
The authors propose a novel procedure, based on linear algebra and constrained optimization, that corrects batch effects in t-SNE embeddings and remains fast to compute in many high-dimensional settings.
Findings
Successfully removes multiple batch effects from t-SNE embeddings
Retains fundamental biological information such as cell types
Effective in real single-cell gene expression datasets
Abstract
Biomedical research often produces high-dimensional data confounded by batch effects such as systematic experimental variations, different protocols and subject identifiers. Without proper correction, low-dimensional representations of high-dimensional data might encode and reproduce the same systematic variations observed in the original data, and compromise the interpretation of the results. In this article, we propose a novel procedure to remove batch effects from low-dimensional embeddings obtained with t-SNE dimensionality reduction. The proposed methods are based on linear algebra and constrained optimization, leading to efficient algorithms and fast computation in many high-dimensional settings. Results on artificial single-cell transcription profiling data show that the proposed procedure successfully removes multiple batch effects from t-SNE embeddings, while retaining…
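The abstract does not spell out the algorithm, so the following is only a rough illustration of what "removing batch effects with linear algebra" can look like, not the authors' implementation. It assumes that batch membership is encoded as an indicator matrix `Z` and that a two-dimensional embedding `Y` is already available; it then removes from `Y` everything that is linearly explained by `Z` through a least-squares projection. The names `one_hot` and `project_out_batch`, and the simulated data, are hypothetical.

```python
import numpy as np

def one_hot(labels):
    """Encode batch labels as an n x k indicator matrix (one column per batch)."""
    labels = np.asarray(labels)
    levels = np.unique(labels)
    return (labels[:, None] == levels[None, :]).astype(float)

def project_out_batch(Y, Z):
    """Return the least-squares residuals of Y on Z.

    The result is orthogonal to every column of Z, so no linear
    function of the batch indicators can explain the output.
    """
    Y = np.asarray(Y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # lstsq also handles rank-deficient Z (e.g. several stacked batch variables).
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return Y - Z @ coef

# Simulated stand-in for an embedding: a shared signal plus a per-batch shift.
rng = np.random.default_rng(0)
n = 200
batch = rng.integers(0, 3, size=n)             # three hypothetical batches
Z = one_hot(batch)
signal = rng.normal(size=(n, 2))               # structure we want to keep
shift = np.array([[5.0, 0.0], [0.0, 5.0], [-5.0, -5.0]])
Y = signal + shift[batch]                      # embedding contaminated by batch

Y_corrected = project_out_batch(Y, Z)
```

In this sketch the correction is applied after the embedding has been computed. The abstract also mentions constrained optimization, which suggests the paper enforces a related constraint while fitting t-SNE; that step is not reproduced here. Several batch variables could be handled under the same assumptions by stacking their indicator matrices with `np.hstack` before calling `project_out_batch`.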
