Projected $t$-SNE for batch correction

Emanuele Aliverti, Jeff Tilson, Dayne Filer, Benjamin Babcock, Alejandro Colaneri, Jennifer Ocasio, Timothy R. Gershon, Kirk C. Wilhelmsen, and David B. Dunson

arXiv:1911.06708 · stat.AP · April 6, 2020 · Bioinform.

TL;DR

This paper introduces a method for removing batch effects from t-SNE embeddings of high-dimensional biomedical data, so that the low-dimensional visualization is easier to interpret while biological signal such as cell type is preserved.

Contribution

The authors propose a procedure, based on linear algebra and constrained optimization, that corrects batch effects in t-SNE embeddings and remains fast to compute in many high-dimensional settings.

Findings

01 Successfully removes multiple batch effects from t-SNE embeddings

02 Retains fundamental biological information such as cell types

03 Effective in real single-cell gene expression datasets

Abstract

Biomedical research often produces high-dimensional data confounded by batch effects such as systematic experimental variations, different protocols and subject identifiers. Without proper correction, low-dimensional representations of high-dimensional data might encode and reproduce the same systematic variations observed in the original data, and compromise the interpretation of the results. In this article, we propose a novel procedure to remove batch effects from low-dimensional embeddings obtained with t-SNE dimensionality reduction. The proposed methods are based on linear algebra and constrained optimization, leading to efficient algorithms and fast computation in many high-dimensional settings. Results on artificial single-cell transcription profiling data show that the proposed procedure successfully removes multiple batch effects from t-SNE embeddings, while retaining…
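The abstract describes the method only as "linear algebra and constrained optimization". As a rough illustration of the kind of linear-algebra step that can enforce such a constraint, the sketch below projects an embedding onto the orthogonal complement of a batch design matrix, so that each embedding coordinate has no linear association with the batch indicators. This is an assumption-laden sketch and not the authors' algorithm: the function names `residual_projector` and `remove_batch_component`, the one-hot batch encoding, and the simulated embedding are all hypothetical, and the way such a projection would be combined with the t-SNE objective is not stated in the text above.

```python
import numpy as np


def residual_projector(Z):
    """Projector onto the orthogonal complement of the column space of Z.

    Z @ pinv(Z) is the orthogonal projector onto col(Z), so subtracting it
    from the identity removes every component that is linear in Z.
    (Hypothetical helper, not taken from the paper.)
    """
    n = Z.shape[0]
    return np.eye(n) - Z @ np.linalg.pinv(Z)


def remove_batch_component(Y, Z):
    """Return the part of the embedding Y that is orthogonal to the batch matrix Z."""
    return residual_projector(Z) @ Y


# Simulated data (assumption): 60 points in a 2-D embedding, 3 batches,
# each batch shifted by a different offset to mimic a batch effect.
rng = np.random.default_rng(0)
n, n_batches = 60, 3
batch = rng.integers(0, n_batches, size=n)
Z = np.eye(n_batches)[batch]  # one-hot batch indicators, shape (n, 3)
shift = np.array([[5.0, 0.0], [0.0, 5.0], [-5.0, -5.0]])
Y = rng.normal(size=(n, 2)) + shift[batch]

P = residual_projector(Z)
Y_corr = remove_batch_component(Y, Z)

# After projection, every batch has zero mean in each embedding coordinate,
# so Z.T @ Y_corr is numerically zero.
print(np.abs(Z.T @ Y_corr).max())
```

With one-hot indicators this projection amounts to centering each batch separately. A single post-hoc projection like this one only removes linear batch structure from a finished embedding; enforcing the constraint during optimization, as the abstract's wording suggests, would be a different procedure.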
