Nonparametric Factor Analysis and Beyond
By: Yujia Zheng, Yang Liu, Jiaxiong Yao and more
Potential Business Impact:
Finds hidden causes even with messy data.
Nearly all identifiability results in unsupervised representation learning, inspired by, e.g., independent component analysis, factor analysis, and causal representation learning, rely on assumptions of additive independent noise or noiseless regimes. In contrast, we study the more general case where noise can take arbitrary forms, depend on latent variables, and be non-invertibly entangled within a nonlinear function. We propose a general framework for identifying latent variables in nonparametric noisy settings. We first show that, under suitable conditions, the generative model is identifiable up to certain submanifold indeterminacies even in the presence of non-negligible noise. Furthermore, under structural or distributional variability conditions, we prove that the latent variables of general nonlinear models are identifiable up to trivial indeterminacies. Based on the proposed theoretical framework, we have also developed corresponding estimation methods and validated them in various synthetic and real-world settings. Interestingly, our estimate of true GDP growth from alternative measurements suggests more insightful information about the economies than official reports. We expect our framework to provide new insight into how both researchers and practitioners deal with latent variables in real-world scenarios.
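As a rough illustration of the contrast drawn in the abstract, the settings it mentions can be written side by side. The symbols x (observations), z (latent variables), \epsilon (noise), and f, g (nonlinear mixing functions) are notation assumed here for illustration and are not taken from the paper; the snippet also assumes the amsmath package.

```latex
% Notation (x, z, \epsilon, f, g) is assumed for illustration only.
\begin{align*}
\text{additive independent noise:} \quad & x = f(z) + \epsilon, \qquad \epsilon \perp\!\!\!\perp z, \\
\text{noiseless:} \quad & x = f(z), \\
\text{general noisy case:} \quad & x = g(z, \epsilon), \qquad p(\epsilon \mid z) \text{ arbitrary}.
\end{align*}
```

In the last line, g need not be invertible with respect to \epsilon, so the noise cannot in general be separated from z simply by inverting the mixing function.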
Similar Papers
Bayesian analysis of nonlinear structured latent factor models using a Gaussian Process Prior
Methodology
Find hidden patterns in complex data.
Identifiability and Inference for Generalized Latent Factor Models
Methodology
Finds hidden patterns in data for better understanding.
Identifiability and Estimation in High-Dimensional Nonparametric Latent Structure Models
Statistics Theory
Find hidden patterns in complex data better.