Advanced spectral clustering for heterogeneous data in credit risk monitoring systems
By: Lu Han , Mengyan Li , Jiping Qiang and more
Potential Business Impact:
Finds risky companies using money and words.
Heterogeneous data, which encompass both numerical financial variables and textual records, present substantial challenges for credit monitoring. To address this issue, we propose Advanced Spectral Clustering (ASC), a method that integrates financial and textual similarities through an optimized weight parameter and selects eigenvectors using a novel eigenvalue-silhouette optimization approach. Evaluated on a dataset comprising 1,428 small and medium-sized enterprises (SMEs), ASC achieves a Silhouette score that is 18% higher than that of a single-type data baseline method. Furthermore, the resulting clusters offer actionable insights; for instance, 51% of low-risk firms are found to include the term 'social recruitment' in their textual records. The robustness of ASC is confirmed across multiple clustering algorithms, including k-means, k-medians, and k-medoids, with {\Delta}Intra/Inter < 0.13 and {\Delta}Silhouette Coefficient < 0.02. By bridging spectral clustering theory with heterogeneous data applications, ASC enables the identification of meaningful clusters, such as recruitment-focused SMEs exhibiting a 30% lower default risk, thereby supporting more targeted and effective credit interventions.
Similar Papers
Scalable Parameter-Light Spectral Method for Clustering Short Text Embeddings with a Cohesion-Based Evaluation Metric
Machine Learning (CS)
Finds hidden groups in text without guessing.
Scalable Parameter-Light Spectral Method for Clustering Short Text Embeddings with a Cohesion-Based Evaluation Metric
Machine Learning (CS)
Finds hidden groups in text without guessing.
An Improved and Generalised Analysis for Spectral Clustering
Machine Learning (CS)
Finds hidden groups in connected information.