Automatic Classification of User Requirements from Online Feedback -- A Replication Study
By: Meet Bhatt, Nic Boilard, Muhammad Rehan Chaudhary and more
Potential Business Impact:
Helps computers understand user feedback better.
Natural language processing (NLP) techniques have been widely applied in the requirements engineering (RE) field to support tasks such as classification and ambiguity detection. Although RE research is rooted in empirical investigation, it has paid limited attention to replicating NLP for RE (NLP4RE) studies. Rapid advances in NLP are creating new opportunities for efficient, machine-assisted workflows, which can bring new perspectives and results to the forefront. Thus, we replicate and extend a previous NLP4RE study (the baseline), "Classifying User Requirements from Online Feedback in Small Dataset Environments using Deep Learning", which evaluated different deep learning models for requirement classification from user reviews. We reproduced the original results using the publicly released source code, thereby helping to strengthen the external validity of the baseline study. We then extended the setup by evaluating model performance on an external dataset and comparing the results to a GPT-4o zero-shot classifier. Furthermore, we prepared the replication study ID-card for the baseline study, which is important for evaluating replication readiness. Results showed varying levels of reproducibility across models: Naive Bayes was perfectly reproducible, whereas BERT and other models showed mixed results. Our findings revealed that the baseline deep learning models, BERT and ELMo, generalized well to an external dataset, and that GPT-4o performed comparably to the traditional baseline machine learning models. Additionally, our assessment confirmed the baseline study's replication readiness; however, the environment setup files were missing, and including them would have further enhanced readiness. We include this missing information in our replication package and provide the replication study ID-card for our own study to encourage and support its replication.