Exploring the Potential of Large Language Models in Fine-Grained Review Comment Classification
By: Linh Nguyen, Chunhua Liu, Hong Yi Lin, and more
Potential Business Impact:
Helps computers understand code feedback better.
Code review is a crucial practice in software development. Because modern code review is lightweight, reviewers can raise a wide range of issues, some of which are trivial. Research has investigated automated approaches to classifying review comments in order to gauge the effectiveness of code reviews. However, previous studies have relied primarily on supervised machine learning, which requires extensive manual annotation to train models effectively. To address this limitation, we explore the potential of Large Language Models (LLMs) to classify code review comments. We assess how well LLMs classify review comments into 17 categories. Our results show that LLMs can classify code review comments, outperforming a state-of-the-art approach based on a trained deep learning model. In particular, LLMs achieve better accuracy on the five most useful categories, where the state-of-the-art approach struggles due to scarce training examples. Rather than depending on the distribution of a small training set, LLMs deliver balanced performance across both high- and low-frequency categories. These results suggest that LLMs could offer a scalable solution for code review analytics and improve the effectiveness of the code review process.
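The approach described in the abstract amounts to prompting an LLM to assign each review comment to one category from a fixed taxonomy. The sketch below shows one way such a classifier could be set up, assuming the OpenAI Python SDK; the model name, prompt wording, and category list are illustrative placeholders, not the paper's actual 17-category taxonomy or experimental setup.

```python
# A minimal sketch of zero-shot review-comment classification with an LLM,
# assuming the OpenAI Python SDK (pip install openai). The model name, prompt
# wording, and categories below are illustrative assumptions only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical subset of comment categories, for illustration only;
# the paper classifies comments into 17 categories not reproduced here.
CATEGORIES = ["defect", "style", "documentation", "refactoring", "question"]

def classify_comment(comment: str) -> str:
    """Ask the model to assign a review comment to exactly one category."""
    prompt = (
        "Classify the following code review comment into exactly one of "
        f"these categories: {', '.join(CATEGORIES)}.\n"
        f"Comment: {comment}\n"
        "Answer with the category name only."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for reproducible labels
    )
    return response.choices[0].message.content.strip().lower()

# Example usage:
# print(classify_comment("Please rename this variable; 'tmp2' is unclear."))
```

Because this setup needs no labeled training data, it sidesteps the annotation bottleneck of supervised classifiers, which is the motivation the abstract gives for trying LLMs.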
Similar Papers
Operationalizing Large Language Models with Design-Aware Contexts for Code Comment Generation
Software Engineering
Helps computers write better explanations for code.
Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks
Software Engineering
Makes computer code easier to understand and write.
Applying Large Language Models to Issue Classification: Revisiting with Extended Data and New Models
Software Engineering
Sorts software problems faster, needing less data.