Tensor Neyman-Pearson Classification: Theory, Algorithms, and Error Control
By: Lingchong Liu, Elynn Chen, Yuefeng Han and more
Potential Business Impact:
Finds dangerous chemicals more reliably.
Biochemical discovery increasingly relies on classifying molecular structures when the consequences of different errors are highly asymmetric. In mutagenicity and carcinogenicity, misclassifying a harmful compound as benign can trigger substantial scientific, regulatory, and health risks, whereas false alarms primarily increase laboratory workload. Modern representations transform molecular graphs into persistence image tensors that preserve multiscale geometric and topological structure, yet existing tensor classifiers and deep tensor neural networks provide no finite-sample guarantees on type I error and often exhibit severe error inflation in practice. We develop the first Tensor Neyman-Pearson (Tensor-NP) classification framework that achieves finite-sample control of type I error while exploiting the multi-mode structure of tensor data. Under a tensor-normal mixture model, we derive the oracle NP discriminant, characterize its Tucker low-rank manifold geometry, and establish tensor-specific margin and conditional detection conditions enabling high-probability bounds on excess type II error. We further propose a Discriminant Tensor Iterative Projection estimator and a Tensor-NP Neural Classifier combining deep learning with Tensor-NP umbrella calibration, yielding the first distribution-free NP-valid methods for multiway data. Across four biochemical datasets, Tensor-NP classifiers maintain type I errors at prespecified levels while delivering competitive type II error performance, providing reliable tools for asymmetric-risk decisions with complex molecular tensors.
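The abstract does not spell out how the umbrella calibration step works. As a rough illustration, the sketch below implements the standard Neyman-Pearson umbrella rule for choosing a score threshold from held-out class-0 samples, on the assumption that this is what "Tensor-NP umbrella calibration" builds on. The function names `violation_bound` and `umbrella_threshold`, the convention that class 0 is the harmful class whose misclassification is the type I error, and the use of a scalar score per sample are all assumptions made for this example, not details taken from the paper.

```python
import math


def violation_bound(k, n, alpha):
    """Upper bound on P(type I error > alpha) when the threshold is the
    k-th smallest of n held-out class-0 scores (continuous scores assumed)."""
    return sum(
        math.comb(n, j) * (1 - alpha) ** j * alpha ** (n - j)
        for j in range(k, n + 1)
    )


def umbrella_threshold(class0_scores, alpha=0.05, delta=0.05):
    """Pick the smallest order statistic whose violation bound is <= delta.

    A sample is then labelled class 1 when its score exceeds the returned
    threshold. Returns (threshold, k), with k the 1-based rank used.
    """
    scores = sorted(class0_scores)
    n = len(scores)
    # Even the largest order statistic fails if (1 - alpha)^n > delta.
    if (1 - alpha) ** n > delta:
        raise ValueError("too few class-0 samples for this alpha and delta")
    for k in range(1, n + 1):
        if violation_bound(k, n, alpha) <= delta:
            return scores[k - 1], k


if __name__ == "__main__":
    # Hypothetical held-out scores from any base classifier.
    held_out = [i / 100 for i in range(100)]
    threshold, rank = umbrella_threshold(held_out, alpha=0.05, delta=0.05)
    print(threshold, rank)
```

The rule is distribution-free because it depends only on the ranks of the held-out class-0 scores, so in principle it can wrap any scoring function, including one produced by a tensor classifier or a neural network. The price is a minimum held-out sample size: the check on `(1 - alpha) ** n` rejects calibration sets that are too small to certify the requested level.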
Similar Papers
Geometric Calibration and Neutral Zones for Uncertainty-Aware Multi-Class Classification
Machine Learning (Stat)
Helps computers know when they are unsure.
A Fully Probabilistic Tensor Network for Regularized Volterra System Identification
Machine Learning (Stat)
Makes complex computer models simpler and smarter.