How Do Community Smells Influence Self-Admitted Technical Debt in Machine Learning Projects?

Published: June 18, 2025 | arXiv ID: 2506.15884v2

By: Shamse Tasnim Cynthia, Nuri Almarimi, Banani Roy

Potential Business Impact:

Fixes messy code by spotting bad team habits.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

Community smells reflect poor organizational practices that often lead to socio-technical issues and the accumulation of Self-Admitted Technical Debt (SATD). While prior studies have explored these problems in general software systems, their interplay in machine learning (ML)-based projects remains largely underexamined. In this study, we investigated the prevalence of community smells and their relationship with SATD in open-source ML projects, analyzing data at the release level. First, we examined the prevalence of ten community smell types across the releases of 155 ML-based systems and found that community smells are widespread, exhibiting distinct distribution patterns across small, medium, and large projects. Second, we detected SATD at the release level and applied statistical analysis to examine its correlation with community smells. Our results showed that certain smells, such as Radio Silence and Organizational Silos, are strongly correlated with higher SATD occurrences. Third, we considered the six identified types of SATD to determine which community smells are most associated with each debt category. Our analysis revealed that authority- and communication-related smells often co-occur with persistent code and design debt. Finally, we analyzed how community smells and SATD evolve across releases, uncovering project size-dependent trends and shared trajectories. Our findings emphasize the importance of early detection and mitigation of socio-technical issues to maintain the long-term quality and sustainability of ML-based systems.
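
The release-level analysis described above (detect SATD per release, then correlate it with community smell counts) can be illustrated with a minimal sketch. The abstract names neither the SATD detector nor the statistical test, so everything here is an assumption for illustration: the keyword list in `SATD_PATTERN`, the helper names `count_satd`, `average_ranks`, and `spearman`, and the choice of Spearman rank correlation are hypothetical stand-ins, not the authors' method.

```python
import re
from statistics import mean

# Hypothetical marker list (an assumption); real SATD detectors
# typically use richer patterns or trained classifiers.
SATD_PATTERN = re.compile(r"\b(todo|fixme|hack|xxx|workaround)\b", re.IGNORECASE)


def count_satd(comments):
    """Count the comments that contain at least one debt marker."""
    return sum(1 for c in comments if SATD_PATTERN.search(c))


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Made-up toy numbers, one entry per release; not data from the study.
    smell_counts = [1, 2, 3, 4, 5]
    satd_counts = [2, 3, 5, 8, 13]
    print(spearman(smell_counts, satd_counts))
```

A rank-based coefficient is used in this sketch because per-release counts are rarely normally distributed; a study could equally use another test, and significance testing is omitted here.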

Country of Origin
🇨🇦 Canada

Page Count
12 pages

Category
Computer Science: Software Engineering