Doubly-Stochastic Normalization of the Gaussian Kernel is Robust to Heteroskedastic Noise
Boris Landa, Ronald R. Coifman, Yuval Kluger

TL;DR
This paper shows that doubly-stochastic normalization of the Gaussian kernel is robust to heteroskedastic noise: it automatically accounts for noise variances that differ across observations in high-dimensional data, and in that setting it outperforms the row-stochastic and symmetric normalizations.
Contribution
It proves that the doubly-stochastic normalized affinity matrix built from noisy data converges to its clean counterpart under heteroskedastic noise, and demonstrates the advantages of this normalization on numerical and real-data examples.
Findings
The doubly-stochastic normalized noisy affinity matrix converges to its clean counterpart at rate m^{-1/2}, where m is the dimension of the ambient space.
It outperforms row-stochastic and symmetric normalizations under heteroskedastic noise.
The method is effective in analyzing single-cell RNA sequencing data with heteroskedasticity.
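For reference, the three normalizations compared in these findings can be written as follows. The notation is an assumption made for illustration and is not taken from this page: x_1, ..., x_n are the data points, \varepsilon > 0 is the kernel bandwidth, and K is the Gaussian kernel matrix with zero main diagonal.

```latex
K_{ij} =
\begin{cases}
  \exp\!\left( -\|x_i - x_j\|^2 / \varepsilon \right), & i \neq j, \\
  0, & i = j,
\end{cases}
\qquad
\begin{aligned}
  W^{(\mathrm{r})}_{ij} &= \frac{K_{ij}}{\sum_{k} K_{ik}}
    && \text{(row-stochastic)} \\
  W^{(\mathrm{s})}_{ij} &= \frac{K_{ij}}{\sqrt{\sum_{k} K_{ik}}\,\sqrt{\sum_{k} K_{jk}}}
    && \text{(symmetric)} \\
  W^{(\mathrm{d})}_{ij} &= d_i K_{ij} d_j,
    \quad d_i > 0, \quad \sum_{j} W^{(\mathrm{d})}_{ij} = 1 \ \text{for all } i
    && \text{(doubly-stochastic)}
\end{aligned}
```

Because K is symmetric, W^{(d)} is symmetric as well, so unit row sums imply unit column sums. The scaling factors d_i have no closed form and are typically found iteratively.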
Abstract
A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to form an affinity matrix by the Gaussian kernel with pairwise distances, and to follow with a certain normalization (e.g., the row-stochastic normalization or its symmetric variant). We demonstrate that the doubly-stochastic normalization of the Gaussian kernel with zero main diagonal (i.e., no self loops) is robust to heteroskedastic noise. That is, the doubly-stochastic normalization is advantageous in that it automatically accounts for observations with different noise variances. Specifically, we prove that in a suitable high-dimensional setting where heteroskedastic noise does not concentrate too much in any particular direction in space, the resulting (doubly-stochastic)…
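The construction described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the bandwidth `eps`, the synthetic data, and the damped symmetric Sinkhorn-type iteration used to find the scaling factors are all choices made here for the example.

```python
import numpy as np


def gaussian_kernel_zero_diag(X, eps):
    """Gaussian kernel of pairwise squared distances, with zero main diagonal.

    X is an (n, m) array holding n points in R^m; eps is the bandwidth
    (a hypothetical parameter name chosen for this sketch).
    """
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(D2, 0.0, out=D2)   # clip tiny negatives caused by round-off
    K = np.exp(-D2 / eps)
    np.fill_diagonal(K, 0.0)      # no self loops
    return K


def doubly_stochastic_scale(K, n_iter=1000, tol=1e-10):
    """Scale a symmetric nonnegative K to W = diag(d) K diag(d) with unit row sums.

    Uses the damped update d <- sqrt(d / (K d)), whose fixed point satisfies
    d_i * sum_j K_ij d_j = 1. This iteration is an assumption of the sketch.
    """
    d = np.ones(K.shape[0])
    for _ in range(n_iter):
        d = np.sqrt(d / (K @ d))
        if np.max(np.abs(d * (K @ d) - 1.0)) < tol:
            break
    return d[:, None] * K * d[None, :]


# Synthetic example: clean points plus noise whose standard deviation
# differs from one observation to the next (heteroskedastic noise).
rng = np.random.default_rng(0)
n, m = 50, 20
clean = rng.normal(size=(n, m))
noise_std = rng.uniform(0.1, 1.0, size=(n, 1))
X = clean + noise_std * rng.normal(size=(n, m))

K = gaussian_kernel_zero_diag(X, eps=2.0 * m)
W = doubly_stochastic_scale(K)
```

The damped update is used because the plain update d <- 1 / (K d) can oscillate for symmetric scaling; taking the geometric mean of the current and updated values avoids that. The resulting `W` is symmetric with zero diagonal and rows that sum to one, so its columns sum to one as well.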
