Prediction of Coffee Ratings Based On Influential Attributes Using SelectKBest and Optimal Hyperparameters
By: Edmund Agyemang, Lawrence Agbota, Vincent Agbenyeavu and more
Potential Business Impact:
Predicts coffee ratings from user reviews.
This study explores the application of supervised machine learning algorithms to predict coffee ratings from a combination of influential textual and numerical attributes extracted from user reviews. Through careful data preprocessing, including text cleaning, feature extraction using TF-IDF, and feature selection with SelectKBest, the study identifies key factors contributing to coffee quality assessments. Six models (Decision Tree, K-Nearest Neighbors, Multi-layer Perceptron, Random Forest, Extra Trees, and XGBoost) were trained and evaluated using optimized hyperparameters. Model performance was assessed primarily using F1-score, G-mean, and AUC. Results demonstrate that the ensemble methods (Extra Trees, Random Forest, and XGBoost) and the Multi-layer Perceptron consistently outperform the simpler classifiers (Decision Tree and K-Nearest Neighbors) on these metrics. The findings highlight the importance of rigorous feature selection and hyperparameter tuning in building robust predictive systems for sensory product evaluation, offering a data-driven approach to complement traditional coffee cupping by trained professionals.
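The workflow described in the abstract (TF-IDF features, SelectKBest selection, a tuned classifier, and F1, G-mean, and AUC scoring) can be sketched roughly as follows with scikit-learn. This is a minimal illustration and not the authors' code: the toy reviews, the binary high/low labels, the chi-squared scoring function, the Random Forest choice, and the small parameter grid are all assumptions made for the example.

```python
import math

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Hypothetical toy data (not from the study): 1 = high rating, 0 = low rating.
reviews = [
    "bright floral aroma with sweet citrus and a clean finish",
    "rich chocolate notes, balanced acidity, smooth syrupy body",
    "juicy berry sweetness and a delicate jasmine fragrance",
    "complex fruit tones with honey sweetness and silky mouthfeel",
    "flat woody taste with harsh bitter aftertaste",
    "thin body, stale aroma and sour astringent finish",
    "burnt rubbery flavor, dull and ashy in the cup",
    "muddy earthy notes with a bitter papery finish",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels."""
    sensitivity = recall_score(y_true, y_pred, pos_label=1)
    specificity = recall_score(y_true, y_pred, pos_label=0)
    return math.sqrt(sensitivity * specificity)


# TF-IDF -> SelectKBest -> classifier, chained so that feature selection
# is refit inside every cross-validation fold.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("select", SelectKBest(score_func=chi2)),  # chi2 is an assumed choice
    ("clf", RandomForestClassifier(random_state=0)),
])

# Assumed, deliberately tiny grid; a real search would cover far more values.
param_grid = {
    "select__k": [5, "all"],
    "clf__n_estimators": [25, 50],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="f1",
    cv=StratifiedKFold(n_splits=2, shuffle=True, random_state=0),
)
search.fit(reviews, labels)

# Scored on the training data only to keep the sketch short; a real
# evaluation would use a held-out test set.
pred = search.predict(reviews)
proba = search.predict_proba(reviews)[:, 1]

print("best params:", search.best_params_)
print("F1:", f1_score(labels, pred))
print("G-mean:", g_mean(labels, pred))
print("AUC:", roc_auc_score(labels, proba))
```

Placing SelectKBest inside the pipeline, rather than applying it once to the full dataset, keeps the selected features from being influenced by the validation folds during hyperparameter search.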
Similar Papers
Wine Quality Prediction with Ensemble Trees: A Unified, Leak-Free Comparative Study
Machine Learning (CS)
Helps computers judge wine quality like experts.
The Role of Hyperparameters in Predictive Multiplicity
Machine Learning (CS)
Makes computer predictions more consistent.
A comparative analysis of machine learning algorithms for predicting probabilities of default
Risk Management
Helps banks guess if people will repay loans.