Score: 1

Leveraging Large Language Models for Classifying App Users' Feedback

Published: July 11, 2025 | arXiv ID: 2507.08250v1

By: Yasaman Abedini, Abbas Heydarnoori

Potential Business Impact:

Helps app developers automatically classify and understand user feedback.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

In recent years, significant research has been conducted into classifying application (app) user feedback, primarily relying on supervised machine learning algorithms. However, fine-tuning more generalizable classifiers based on existing labeled datasets remains an important challenge, as creating large and accurately labeled datasets often requires considerable time and resources. In this paper, we evaluate the capabilities of four advanced LLMs, namely GPT-3.5-Turbo, GPT-4, Flan-T5, and Llama3-70b, to enhance user feedback classification and address the challenge of limited labeled datasets. To achieve this, we conduct several experiments on eight datasets that have been meticulously labeled in prior research. These datasets include user reviews from app stores, posts from the X platform, and discussions from public forums, widely recognized as representative sources of app user feedback. We analyze the performance of various LLMs in identifying both fine-grained and coarse-grained user feedback categories. Given the substantial volume of daily user feedback and the computational limitations of LLMs, we leverage these models as an annotation tool to augment labeled datasets with general and app-specific data. This augmentation aims to enhance the performance of state-of-the-art BERT-based classification models. Our findings indicate that LLMs, when guided by well-crafted prompts, can effectively classify user feedback into coarse-grained categories. Moreover, augmenting the training dataset with data labeled by the consensus of LLMs can significantly enhance classifier performance.
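To make the consensus-labeling idea in the abstract concrete, here is a minimal Python sketch of prompting several LLMs to assign a coarse-grained category to a feedback item and keeping only majority-vote labels for dataset augmentation. The category list, the prompt wording, and the helper names (`classify_with_llm`, `consensus_label`) are illustrative assumptions, not the paper's actual prompts or label set; the `llm_call` functions stand in for whatever model clients (e.g., GPT-4 or Llama3-70b wrappers) one would plug in.

```python
from collections import Counter

# Hypothetical coarse-grained feedback categories; the paper's exact
# label set may differ (these are common in app-review classification work).
CATEGORIES = ["bug report", "feature request", "user experience", "other"]

PROMPT_TEMPLATE = (
    "Classify the following app user feedback into exactly one of these "
    "categories: {categories}.\n"
    'Feedback: "{feedback}"\n'
    "Answer with the category name only."
)


def classify_with_llm(llm_call, feedback: str) -> str:
    """Ask a single LLM to assign one coarse-grained category.

    `llm_call` is any function mapping a prompt string to the model's
    text response (e.g., a thin wrapper around an API or local model).
    """
    prompt = PROMPT_TEMPLATE.format(
        categories=", ".join(CATEGORIES), feedback=feedback
    )
    answer = llm_call(prompt).strip().lower()
    # Fall back to "other" if the model answers outside the label set.
    return answer if answer in CATEGORIES else "other"


def consensus_label(llm_calls, feedback: str):
    """Label a feedback item by majority vote across several LLMs.

    Returns None when no category wins a strict majority, so ambiguous
    items can be discarded instead of polluting the augmented training set.
    """
    votes = [classify_with_llm(call, feedback) for call in llm_calls]
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


if __name__ == "__main__":
    # Stub "models" standing in for the real LLM clients, so the sketch runs.
    fake_models = [
        lambda p: "bug report",
        lambda p: "bug report",
        lambda p: "feature request",
    ]
    review = "The app crashes every time I open the settings page."
    print(consensus_label(fake_models, review))  # -> "bug report"
```

The majority-vote threshold is one simple way to operationalize "consensus of LLMs"; the consensus labels would then be merged with the existing labeled data to fine-tune a BERT-based classifier, as the abstract describes.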

Page Count
16 pages

Category
Computer Science:
Software Engineering