Score: 0

Prediction of Spotify Chart Success Using Audio and Streaming Features

Published: May 22, 2025 | arXiv ID: 2508.11632v1

By: Ian Jacob Cabansag, Paul Ntegeka

Potential Business Impact:

Predicts hit songs using music sound.

Spotify's streaming charts offer a real-time lens into music popularity, driving discovery, playlists, and even revenue potential. Understanding what influences a song's rise in ranks on these charts-especially early on-can guide marketing efforts, investment decisions, and even artistic direction. In this project, we developed a classification pipeline to predict a song's chart success based on its musical characteristics and early engagement data. Using all 2024 U.S. Top 200 Spotify Daily Charts and the Spotify Web API, we built a dataset containing both metadata and audio features for 14,639 unique songs. The project was structured in two phases. First, we benchmarked four models: Logistic Regression, K Nearest Neighbors, Random Forest, and XGBoost-using a standard train-test split. In the second phase, we incorporated cross-validation, hyperparameter tuning, and detailed class-level evaluation to ensure robustness. Tree-based models consistently outperformed the rest, with Random Forest and XGBoost achieving macro F1-scores near 0.95 and accuracy around 97%. Even when stream count and rank history were excluded, models trained solely on audio attributes retained predictive power. These findings validate the potential of audio-based modeling in A&R scouting, playlist optimization, and hit forecasting-long before a track reaches critical mass.

Page Count
6 pages

Category
Computer Science:
Sound