Prediction of Spotify Chart Success Using Audio and Streaming Features
By: Ian Jacob Cabansag, Paul Ntegeka
Potential Business Impact:
Predicts hit songs from their audio characteristics.
Spotify's streaming charts offer a real-time lens into music popularity, driving discovery, playlists, and even revenue potential. Understanding what influences a song's rise through the ranks on these charts, especially early on, can guide marketing efforts, investment decisions, and even artistic direction. In this project, we developed a classification pipeline to predict a song's chart success based on its musical characteristics and early engagement data. Using all 2024 U.S. Top 200 Spotify Daily Charts and the Spotify Web API, we built a dataset containing both metadata and audio features for 14,639 unique songs. The project was structured in two phases. In the first phase, we benchmarked four models (Logistic Regression, K-Nearest Neighbors, Random Forest, and XGBoost) using a standard train-test split. In the second phase, we incorporated cross-validation, hyperparameter tuning, and detailed class-level evaluation to ensure robustness. Tree-based models consistently outperformed the rest, with Random Forest and XGBoost achieving macro F1-scores near 0.95 and accuracy around 97%. Even when stream count and rank history were excluded, models trained solely on audio attributes retained predictive power. These findings support the potential of audio-based modeling in A&R scouting, playlist optimization, and hit forecasting, long before a track reaches critical mass.
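The abstract reports both macro F1 and accuracy, and the two can diverge when classes are imbalanced. The sketch below illustrates the difference in plain Python. It is not the authors' evaluation code: the binary "hit" / "non-hit" labels, the 90/10 class split, and the function names `macro_f1` and `accuracy` are all assumptions made for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Each class contributes equally, regardless of how many
    samples it has, so weak performance on a rare class is
    not hidden by a dominant class.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, made-up labels: 0 = "non-hit", 1 = "hit" (assumed classes).
# 90 non-hits are all predicted correctly; only 5 of 10 hits are found.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 90 + [1] * 5 + [0] * 5

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

On this toy data, accuracy stays high because the majority class dominates, while macro F1 drops because half of the minority class is missed. That is one reason class-level evaluation is a useful complement to a single accuracy figure.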