Advanced spectral clustering for heterogeneous data in credit risk monitoring systems

Published: August 30, 2025 | arXiv ID: 2509.00546v1

By: Lu Han, Mengyan Li, Jiping Qiang and more

Potential Business Impact:

Finds risky companies using money and words.

Business Areas:
Text Analytics, Data and Analytics, Software

Heterogeneous data, which encompass both numerical financial variables and textual records, present substantial challenges for credit monitoring. To address this issue, we propose Advanced Spectral Clustering (ASC), a method that integrates financial and textual similarities through an optimized weight parameter and selects eigenvectors using a novel eigenvalue-silhouette optimization approach. Evaluated on a dataset comprising 1,428 small and medium-sized enterprises (SMEs), ASC achieves a Silhouette score that is 18% higher than that of a single-type data baseline method. Furthermore, the resulting clusters offer actionable insights; for instance, 51% of low-risk firms are found to include the term 'social recruitment' in their textual records. The robustness of ASC is confirmed across multiple clustering algorithms, including k-means, k-medians, and k-medoids, with ΔIntra/Inter < 0.13 and ΔSilhouette Coefficient < 0.02. By bridging spectral clustering theory with heterogeneous data applications, ASC enables the identification of meaningful clusters, such as recruitment-focused SMEs exhibiting a 30% lower default risk, thereby supporting more targeted and effective credit interventions.
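
The general idea of fusing two similarity matrices with a weight and scoring the result by Silhouette can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the Gaussian and cosine similarities, the convex combination `alpha * S_fin + (1 - alpha) * S_txt`, the grid search over `alpha`, and the use of the first `k` eigenvectors are all assumptions made for illustration. The abstract does not specify the similarity measures or the eigenvalue-silhouette selection rule, so none of that is reproduced here.

```python
import numpy as np

def rbf_similarity(X, gamma=1.0):
    """Gaussian similarity between rows of a numeric feature matrix (assumed choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def cosine_similarity(T):
    """Cosine similarity between rows of a non-negative term-weight matrix (assumed choice)."""
    norms = np.linalg.norm(T, axis=1, keepdims=True)
    U = T / np.where(norms == 0, 1.0, norms)
    return np.clip(U @ U.T, 0.0, 1.0)

def spectral_embedding(S, k):
    """Row-normalized eigenvectors for the k smallest eigenvalues of the normalized Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(S.sum(axis=1), 1e-12))
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    E = vecs[:, :k]
    return E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)

def kmeans(E, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [E[0]]
    for _ in range(1, k):
        d = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        new = np.array([E[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def silhouette(E, labels):
    """Mean Silhouette coefficient using Euclidean distances in the embedding."""
    clusters = set(labels.tolist())
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treated as worst case here
    D = np.sqrt(((E[:, None, :] - E[None, :, :]) ** 2).sum(axis=-1))
    scores = []
    for i in range(len(E)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = D[i, same].sum() / (same.sum() - 1)
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b, 1e-12))
    return float(np.mean(scores))

def select_weight(X_fin, X_txt, k, alphas=None):
    """Grid-search the fusion weight alpha, keeping the best Silhouette score."""
    if alphas is None:
        alphas = np.linspace(0.0, 1.0, 11)
    S_fin, S_txt = rbf_similarity(X_fin), cosine_similarity(X_txt)
    best = None
    for alpha in alphas:
        S = alpha * S_fin + (1.0 - alpha) * S_txt  # hypothetical fusion rule
        E = spectral_embedding(S, k)
        labels = kmeans(E, k)
        score = silhouette(E, labels)
        if best is None or score > best[1]:
            best = (float(alpha), score, labels)
    return best
```

In this sketch, `alpha = 0` and `alpha = 1` correspond to clustering on text-only and finance-only similarity, which gives a rough analogue of a single-type baseline to compare against. Swapping `kmeans` for a k-medians or k-medoids routine would be one way to repeat the kind of robustness check the abstract reports.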

Country of Origin
🇨🇳 China

Page Count
25 pages

Category
Computer Science:
Machine Learning (CS)