
Prediction of Coffee Ratings Based On Influential Attributes Using SelectKBest and Optimal Hyperparameters

Published: September 10, 2025 | arXiv ID: 2509.18124v1

By: Edmund Agyemang, Lawrence Agbota, Vincent Agbenyeavu and more

Potential Business Impact:

Predicts coffee taste from reviews.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

This study explores the application of supervised machine learning algorithms to predict coffee ratings based on a combination of influential textual and numerical attributes extracted from user reviews. Through careful data preprocessing, including text cleaning, feature extraction using TF-IDF, and selection with SelectKBest, the study identifies key factors contributing to coffee quality assessments. Six models (Decision Tree, K-Nearest Neighbors, Multi-layer Perceptron, Random Forest, Extra Trees, and XGBoost) were trained and evaluated using optimized hyperparameters. Model performance was assessed primarily using F1-score, G-mean, and AUC metrics. Results demonstrate that ensemble methods (Extra Trees, Random Forest, and XGBoost), as well as the Multi-layer Perceptron, consistently outperform simpler classifiers (Decision Tree and K-Nearest Neighbors) on these metrics. The findings highlight the importance of rigorous feature selection and hyperparameter tuning in building robust predictive systems for sensory product evaluation, offering a data-driven approach to complement traditional coffee cupping by trained professionals.
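
The workflow the abstract describes (TF-IDF features, SelectKBest selection, a tuned classifier, then F1 and G-mean scoring) can be sketched with scikit-learn as below. This is a minimal illustration and not the authors' code. The toy reviews, the binary high/low target, the chi-squared scoring function, the grid values, and the use of grid search are all assumptions made for the example. The abstract does not specify any of them.

```python
import math

from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy reviews; 1 = high rating, 0 = low rating (assumed binary target).
reviews = [
    "bright citrus acidity with sweet floral aroma",
    "rich chocolate body and clean sweet finish",
    "juicy berry notes with balanced silky mouthfeel",
    "complex fruity cup with honey sweetness",
    "flat stale taste with harsh bitter finish",
    "burnt rubbery aroma and thin sour body",
    "muddy woody flavor with dull bitter aftertaste",
    "harsh ashy cup with stale papery notes",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# TF-IDF turns text into non-negative weights, SelectKBest keeps the k
# highest-scoring terms, and Extra Trees classifies on those terms.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("select", SelectKBest(score_func=chi2)),  # chi2 is an assumed scoring function
    ("clf", ExtraTreesClassifier(random_state=0)),
])

# Assumed, deliberately tiny grid: tune the number of kept features
# and the number of trees together, scored by cross-validated F1.
param_grid = {"select__k": [3, 5], "clf__n_estimators": [25, 50]}
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=2)
search.fit(reviews, labels)

# Predictions on the training data, only to show how the metrics are computed;
# a real evaluation would use held-out reviews.
predicted = search.predict(reviews)
f1 = f1_score(labels, predicted)

# Binary G-mean: geometric mean of recall on the positive and negative class.
sensitivity = recall_score(labels, predicted, pos_label=1)
specificity = recall_score(labels, predicted, pos_label=0)
g_mean = math.sqrt(sensitivity * specificity)

print(search.best_params_)
print(f"F1={f1:.2f}, G-mean={g_mean:.2f}")
```

Keeping feature selection inside the pipeline means it is refit on each cross-validation training fold, so the selected terms never see the validation reviews.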

Page Count
13 pages

Category
Computer Science:
Machine Learning (CS)