Score: 0

Advancing Text Classification with Large Language Models and Neural Attention Mechanisms

Published: December 10, 2025 | arXiv ID: 2512.09444v1

By: Ning Lyu , Yuxi Wang , Feng Chen and more

Potential Business Impact:

Helps computers understand and sort text better.

Business Areas:
Natural Language Processing Artificial Intelligence, Data and Analytics, Software

This study proposes a text classification algorithm based on large language models, aiming to address the limitations of traditional methods in capturing long-range dependencies, understanding contextual semantics, and handling class imbalance. The framework includes text encoding, contextual representation modeling, attention-based enhancement, feature aggregation, and classification prediction. In the representation stage, deep semantic embeddings are obtained through large-scale pretrained language models, and attention mechanisms are applied to enhance the selective representation of key features. In the aggregation stage, global and weighted strategies are combined to generate robust text-level vectors. In the classification stage, a fully connected layer and Softmax output are used to predict class distributions, and cross-entropy loss is employed to optimize model parameters. Comparative experiments introduce multiple baseline models, including recurrent neural networks, graph neural networks, and Transformers, and evaluate them on Precision, Recall, F1-Score, and AUC. Results show that the proposed method outperforms existing models on all metrics, with especially strong improvements in Recall and AUC. In addition, sensitivity experiments are conducted on hyperparameters and data conditions, covering the impact of hidden dimensions on AUC and the impact of class imbalance ratios on Recall. The findings demonstrate that proper model configuration has a significant effect on performance and reveal the adaptability and stability of the model under different conditions. Overall, the proposed text classification method not only achieves effective performance improvement but also verifies its robustness and applicability in complex data environments through systematic analysis.

Page Count
5 pages

Category
Computer Science:
Computation and Language