Kernel Density Balancing
John Park, Ning Hao, Yue Selena Niu, and Ming Hu

TL;DR
This paper introduces a statistical model for matrix balancing in Hi-C data normalization and a kernel-based balancing estimator that is consistent, converges faster, and is more robust to data sparsity than existing approaches.
Contribution
It proposes a novel kernel-based estimator for matrix balancing in Hi-C data, with theoretical guarantees and practical advantages.
Findings
The kernel-based method is consistent under mild assumptions.
It converges faster than existing methods.
It is more robust to data sparsity.
Abstract
High-throughput chromatin conformation capture (Hi-C) data provide insights into the 3D structure of chromosomes, with normalization being a crucial pre-processing step. A common technique for normalization is matrix balancing, which rescales rows and columns of a Hi-C matrix to equalize their sums. Despite its popularity and convenience, matrix balancing lacks statistical justification. In this paper, we introduce a statistical model to analyze matrix balancing methods and propose a kernel-based estimator that leverages spatial structure. Under mild assumptions, we demonstrate that the kernel-based method is consistent, converges faster, and is more robust to data sparsity compared to existing approaches.
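For readers unfamiliar with the baseline technique, the following is a minimal sketch of plain matrix balancing as the abstract describes it: rows and columns of a symmetric contact matrix are rescaled until all row sums are equal. It is not the paper's kernel-based estimator. The function name `balance_symmetric`, the target row sum of 1, the square-root damped update, and the toy input are all assumptions made for illustration.

```python
import numpy as np


def balance_symmetric(counts, n_iter=500, tol=1e-10):
    """Rescale a symmetric nonnegative matrix so every row sums to 1.

    Hypothetical helper for illustration only. Returns the balanced
    matrix and the per-row bias vector b, where
    balanced[i, j] = counts[i, j] / (b[i] * b[j]).
    Assumes every row has a positive sum.
    """
    A = np.asarray(counts, dtype=float)
    bias = np.ones(A.shape[0])
    for _ in range(n_iter):
        balanced = A / np.outer(bias, bias)
        row_sums = balanced.sum(axis=1)
        # Stop once all row sums are (numerically) equal to 1.
        if np.max(np.abs(row_sums - 1.0)) < tol:
            break
        # Damped update: the square root keeps the symmetric
        # iteration from oscillating between two scalings.
        bias *= np.sqrt(row_sums)
    return A / np.outer(bias, bias), bias


# Toy symmetric "contact matrix" (made-up data, not real Hi-C counts).
rng = np.random.default_rng(0)
raw = rng.random((6, 6)) + 0.1
raw = raw + raw.T

balanced, bias = balance_symmetric(raw)
print(balanced.sum(axis=1))  # row sums after balancing
```

Because the same bias vector scales both rows and columns, the balanced matrix stays symmetric. On sparse matrices, rows with very few counts receive unstable bias estimates under this plain scheme, which is the sparsity issue the abstract says the kernel-based estimator is designed to handle better.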
Taxonomy
Topics: Reservoir Engineering and Simulation Methods
