Score: 0

Identifiability and improper solutions in the probabilistic partial least squares regression with unique variance

Published: December 5, 2025 | arXiv ID: 2512.05328v1

By: Takashi Arai

This paper addresses theoretical issues associated with probabilistic partial least squares (PLS) regression. As in the case of factor analysis, the probabilistic PLS regression with unique variance suffers from the issues of improper solutions and lack of identifiability, both of which causes difficulties in interpreting latent variables and model parameters. Using the fact that the probabilistic PLS regression can be viewed as a special case of factor analysis, we apply a norm constraint prescription on the factor loading matrix in the probabilistic PLS regression, which was recently proposed in the context of factor analysis to avoid improper solutions. Then, we prove that the probabilistic PLS regression with this norm constraint is identifiable. We apply the probabilistic PLS regression to data on amino acid mutations in Human Immunodeficiency Virus (HIV) protease to demonstrate the validity of the norm constraint and to confirm the identifiability numerically. Utilizing the proposed constraint enables the visualization of latent variables via a biplot. We also investigate the sampling distribution of the maximum likelihood estimates (MLE) using synthetically generated data. We numerically observe that MLE is consistent and asymptotically normally distributed.

Category
Statistics:
Methodology