GHaLIB: A Multilingual Framework for Hope Speech Detection in Low-Resource Languages

Published: December 27, 2025 | arXiv ID: 2512.22705v1

By: Ahmed Abdullah, Sana Fatima, Haroon Mahmood

Potential Business Impact:

Finds hopeful messages online across multiple languages, including low-resource ones such as Urdu.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Hope speech remains underrepresented in Natural Language Processing (NLP). Existing studies focus largely on English, leaving low-resource languages such as Urdu with few resources and limiting the development of tools that support positive online communication. Although transformer-based architectures have proven effective at detecting hate and offensive speech, little work has applied them to hope speech or, more generally, tested them across varied linguistic settings. This paper presents a multilingual framework for hope speech detection with a focus on Urdu. We apply simple preprocessing and train classifiers based on pretrained transformer models such as XLM-RoBERTa, mBERT, EuroBERT, and UrduBERT. Evaluations on the PolyHope-M 2025 benchmark show strong performance, with F1-scores of 95.2% for Urdu binary classification and 65.2% for Urdu multi-class classification, and similarly competitive results in Spanish, German, and English. These results suggest that existing multilingual models can be applied in low-resource settings, making hope speech easier to identify and helping to build a more constructive digital discourse.
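
For readers unfamiliar with the F1 metric reported above, the following is a minimal sketch of how a per-class and averaged F1-score can be computed from predictions. The abstract does not state which averaging scheme the authors use, so macro-averaging here is an assumption, and the label names `hope` and `not_hope` are hypothetical placeholders rather than the benchmark's actual labels.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return a dict mapping each label to its F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong for this example
            fn[truth] += 1  # true label was missed for this example
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical binary example with placeholder labels.
y_true = ["hope", "hope", "not_hope", "not_hope"]
y_pred = ["hope", "not_hope", "not_hope", "not_hope"]
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

The same functions apply unchanged to a multi-class setting, since they make no assumption about the number of distinct labels.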

Country of Origin
🇵🇰 Pakistan

Page Count
7 pages

Category
Computer Science:
Computation and Language