Linearized Optimal Transport for Analysis of High-Dimensional Point-Cloud and Single-Cell Data
By: Tianxiang Wang, Yingtong Ke, Dhananjay Bhaskar and more
Potential Business Impact:
Compares cell data to understand sickness and treatments.
Single-cell technologies generate high-dimensional point clouds of cells, enabling detailed characterization of complex patient states and treatment responses. Yet each patient is represented by an irregular point cloud rather than a simple vector, making it difficult to directly quantify and compare biological differences between individuals. Nonlinear methods such as kernels and neural networks achieve predictive accuracy but act as black boxes, offering little biological interpretability. To address these limitations, we adapt the Linear Optimal Transport (LOT) framework to this setting, embedding irregular point clouds into a fixed-dimensional Euclidean space while preserving distributional structure. This embedding provides a principled linear representation that preserves optimal transport geometry while enabling downstream analysis. It also forms a registration between any two patients, enabling direct comparison of their cellular distributions. Within this space, LOT enables: (i) accurate and interpretable classification of COVID-19 patient states, where classifier weights map back to specific markers and spatial regions driving predictions; and (ii) synthetic data generation for patient-derived organoids, exploiting the linearity of the LOT embedding. LOT barycenters yield averaged cellular profiles representing combined conditions or samples, supporting drug interaction testing. Together, these results establish LOT as a unified framework that bridges predictive performance, interpretability, and generative modeling. By transforming heterogeneous point clouds into structured embeddings directly traceable to the original data, LOT opens new opportunities for understanding immune variation and treatment effects in high-dimensional biological systems.
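As a rough illustration of how point clouds of different sizes can be mapped to vectors of one fixed length, the sketch below implements a generic discrete LOT embedding. It is not the authors' code and not the pyLOT API: the function names `ot_plan` and `lot_embed`, the uniform point weights, the squared Euclidean cost, and the toy random data are all assumptions made for this example. Each cloud is matched to a shared reference cloud by solving an exact optimal transport problem, and the barycentric projection of the transport plan gives one displacement vector per reference point.

```python
import numpy as np
from scipy.optimize import linprog


def ot_plan(a, b, cost):
    """Solve the discrete Kantorovich problem exactly as a linear program.

    a, b are probability weights (hypothetical helper, not a library API).
    Returns the transport plan P with row sums a and column sums b.
    """
    n, m = cost.shape
    # Row-sum constraints on the row-major flattened plan: sum_j P[i, j] = a[i]
    rows = np.kron(np.eye(n), np.ones((1, m)))
    # Column-sum constraints: sum_i P[i, j] = b[j]
    cols = np.kron(np.ones((1, n)), np.eye(m))
    # Drop the last constraint: it is implied by the others since sum(a) == sum(b).
    A_eq = np.vstack([rows, cols])[:-1]
    b_eq = np.concatenate([a, b])[:-1]
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(n, m)


def lot_embed(reference, cloud):
    """Embed `cloud` (m x d) as a vector of length n * d relative to `reference` (n x d)."""
    n, m = len(reference), len(cloud)
    a = np.full(n, 1.0 / n)  # assumed uniform weights on reference points
    b = np.full(m, 1.0 / m)  # assumed uniform weights on cloud points
    # Squared Euclidean cost between every reference point and every cloud point
    cost = ((reference[:, None, :] - cloud[None, :, :]) ** 2).sum(axis=-1)
    plan = ot_plan(a, b, cost)
    # Barycentric projection: average location each reference point is sent to
    mapped = (plan @ cloud) / a[:, None]
    # Displacement from the reference, flattened to a fixed-length vector
    return (mapped - reference).ravel()


# Toy data standing in for two patients with different numbers of cells
rng = np.random.default_rng(0)
reference = rng.normal(size=(8, 3))
patient_a = rng.normal(loc=0.5, size=(12, 3))
patient_b = rng.normal(loc=-0.5, size=(20, 3))

emb_a = lot_embed(reference, patient_a)
emb_b = lot_embed(reference, patient_b)
```

Both embeddings have length 8 × 3 = 24 regardless of how many points each cloud contains, so they can be passed to any linear model. Under the same assumptions, averaging `emb_a` and `emb_b` and adding the result back to `reference` gives a point cloud that approximates a barycenter of the two inputs, which is the kind of linear operation the abstract refers to. The linear program grows with the product of the cloud sizes, so a dedicated optimal transport solver would be needed for realistic single-cell data.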
Similar Papers
Linearized Optimal Transport pyLOT Library: A Toolkit for Machine Learning on Point Clouds
Machine Learning (Stat)
Helps computers learn from shapes by making them simpler.
Convex relaxation approaches for high dimensional optimal transport
Optimization and Control
Makes complex math problems easier for computers.