Unsupervised Learning in a General Semiparametric Clusterwise Index Distribution Model

Published: September 29, 2025 | arXiv ID: 2509.24987v1

By: Jen-Chieh Teng, Chin-Tsang Chiang

Potential Business Impact:

Finds hidden groups to explain data patterns.

Business Areas:

A/B Testing, Data and Analytics

This study introduces a general semiparametric clusterwise index distribution model to analyze how latent clusters affect the covariate-response relationships. By employing sufficient dimension reduction to account for the effects of covariates on the cluster variable, we develop a distinct method for estimating model parameters. Building on a subjectwise representation of the underlying model, the proposed separation penalty estimation method partitions individuals and estimates cluster index coefficients. We propose a convergent algorithm for this estimation procedure and incorporate a heuristic initialization to expedite optimization. The resulting partition estimator is subsequently used to fit the cluster membership model and to construct an optimal classification rule, with both procedures iteratively updating the partition and parameter estimators. Another key contribution of our method is the development of two consistent semiparametric information criteria for selecting the number of clusters. In line with principles of classification and estimation in supervised learning, the estimated cluster structure is consistent and optimal, and the parameter estimators possess the oracle property. Comprehensive simulation studies and empirical data analyses illustrate the effectiveness of the proposed methodology.
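
The abstract does not spell out the separation penalty procedure, so the following is only a rough illustration of the general idea of jointly partitioning subjects and estimating cluster-specific coefficients. It is a minimal sketch of a much simpler stand-in, clusterwise linear regression fitted by alternating least squares, and is not the authors' method. The function name `fit_clusterwise_linear`, its parameters, and the linear model itself are assumptions made for illustration.

```python
import numpy as np


def fit_clusterwise_linear(X, y, n_clusters, n_iter=50, seed=0):
    """Hypothetical helper: alternate per-cluster least squares with
    residual-based reassignment (a simple stand-in, not the paper's estimator)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])        # design matrix with intercept
    labels = rng.integers(n_clusters, size=n)    # random initial partition
    coefs = np.zeros((n_clusters, Xd.shape[1]))
    for _ in range(n_iter):
        # Estimation step: refit each cluster's coefficients on its members.
        for k in range(n_clusters):
            mask = labels == k
            if mask.sum() >= Xd.shape[1]:        # skip clusters that are too small
                coefs[k], *_ = np.linalg.lstsq(Xd[mask], y[mask], rcond=None)
        # Partition step: move each subject to the cluster with the smallest
        # squared residual under the current coefficients.
        sq_resid = (y[:, None] - Xd @ coefs.T) ** 2
        new_labels = sq_resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):   # partition is stable: stop
            break
        labels = new_labels
    return labels, coefs
```

Each step can only lower the within-cluster sum of squared residuals, so the loop settles on a stable partition, though possibly a local optimum that depends on the random start. Choosing `n_clusters` is left open here; the paper addresses that with its semiparametric information criteria.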

A Graph-Partitioning Based Continuous Optimization Approach to Semi-supervised Clustering Problems

Optimization and Control

Finds groups in data without knowing how many.

6 Mar 2025 1

87%

Estimation in linear models with clustered data

Econometrics

Helps understand how groups of people affect each other.

18 Aug 2025 2

86%

Robust Inference Methods for Latent Group Panel Models under Possible Group Non-Separation

Econometrics

Finds hidden patterns in data to make better predictions.

23 Nov 2025 1

Country of Origin

🇹🇼 Taiwan, Province of China

Page Count

66 pages
