Kernel Density Balancing
By: John Park, Ning Hao, Yue Selena Niu and more
Potential Business Impact:
Improves how we see inside cells.
High-throughput chromatin conformation capture (Hi-C) data provide insights into the 3D structure of chromosomes, with normalization being a crucial pre-processing step. A common technique for normalization is matrix balancing, which rescales rows and columns of a Hi-C matrix to equalize their sums. Despite its popularity and convenience, matrix balancing lacks statistical justification. In this paper, we introduce a statistical model to analyze matrix balancing methods and propose a kernel-based estimator that leverages spatial structure. Under mild assumptions, we demonstrate that the kernel-based method is consistent and that it converges faster and is more robust to data sparsity than existing approaches.
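To make the matrix balancing step concrete, here is a minimal sketch of a generic iterative balancing scheme for a symmetric contact matrix. It is not the kernel-based estimator proposed in the paper, whose details are not given in the abstract. The function name `balance_symmetric`, its parameters, the target row sum of 1, and the toy matrix are all assumptions made for illustration, and the sketch assumes every row has a strictly positive sum.

```python
import numpy as np


def balance_symmetric(matrix, n_iter=500, tol=1e-10):
    """Rescale a symmetric nonnegative matrix so each row and column sums to 1.

    Hypothetical helper for illustration only. Assumes every row sum is
    strictly positive; real Hi-C matrices often need empty bins filtered first.
    """
    a = np.asarray(matrix, dtype=float)
    bias = np.ones(a.shape[0])  # one scaling factor per bin
    for _ in range(n_iter):
        # Divide entry (i, j) by bias[i] * bias[j], which keeps the matrix symmetric.
        balanced = a / np.outer(bias, bias)
        row_sums = balanced.sum(axis=1)
        if np.max(np.abs(row_sums - 1.0)) < tol:
            break
        # Damped update: the square root splits the correction between rows and columns.
        bias *= np.sqrt(row_sums)
    return a / np.outer(bias, bias), bias


# Toy symmetric "contact" matrix (made-up numbers, not real Hi-C data).
counts = np.array([[4.0, 2.0, 1.0],
                   [2.0, 3.0, 1.0],
                   [1.0, 1.0, 2.0]])
balanced, bias = balance_symmetric(counts)
print(balanced.sum(axis=1))  # row sums after balancing
```

After balancing, the row and column sums are equal up to the tolerance, which is the property the abstract describes. The sketch works on each bin's total independently of its neighbours, whereas the abstract's kernel-based estimator is described as leveraging spatial structure.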
Similar Papers
A tree-based kernel for densities and its applications in clustering DNase-seq profiles
Methodology
Finds DNA patterns to understand how genes turn on.
Sensitivity-Aware Density Estimation in Multiple Dimensions
Machine Learning (CS)
Makes computers better at guessing patterns in messy data.
From Local Updates to Global Balance: A Framework for Distributed Matrix Scaling
Optimization and Control
Helps computers learn from local information.