EmoHopeSpeech: An Annotated Dataset of Emotions and Hope Speech in English and Arabic
By: Wajdi Zaghouani, Md. Rafiul Biswas
Potential Business Impact:
Helps computers understand feelings in Arabic and English.
This research introduces a bilingual dataset of 23,456 Arabic entries and 10,036 English entries annotated for emotions and hope speech, addressing the scarcity of multi-emotion datasets that cover both emotion and hope. The dataset provides comprehensive annotations capturing emotion intensity, complexity, and causes, alongside detailed classifications and subcategories for hope speech. To assess annotation reliability, Fleiss' Kappa was computed, showing agreement of 0.75-0.85 among annotators for both Arabic and English. A baseline machine learning model reached a micro-F1 score of 0.67, which supports the quality of the annotations. This dataset offers a valuable resource for advancing natural language processing in underrepresented languages and for better cross-linguistic analysis of emotions and hope speech.
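For readers unfamiliar with the two statistics quoted above, the following is a minimal sketch of how Fleiss' Kappa and a micro-averaged F1 score can be computed. It is not the authors' code: the function names, the label names ("joy", "hope", "anger", "sadness"), and the toy data are assumptions made purely for illustration, and the summary does not say how many annotators labeled each entry or which baseline model was used.

```python
from collections import Counter


def fleiss_kappa(label_rows):
    """Fleiss' Kappa for items that were each labeled by the same number of annotators.

    label_rows: list of lists; each inner list holds one label per annotator.
    """
    n_items = len(label_rows)
    n_raters = len(label_rows[0])
    if n_raters < 2 or any(len(row) != n_raters for row in label_rows):
        raise ValueError("every item needs the same number (>= 2) of labels")

    category_totals = Counter()
    agreement_sum = 0.0
    for row in label_rows:
        counts = Counter(row)
        category_totals.update(counts)
        # Share of annotator pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))

    observed = agreement_sum / n_items
    total_labels = n_items * n_raters
    # Agreement expected by chance, from the overall category proportions.
    expected = sum((t / total_labels) ** 2 for t in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (observed - expected) / (1.0 - expected)


def micro_f1(gold, predicted):
    """Micro-averaged F1 over label sets (works for single- or multi-label data)."""
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        g, p = set(gold_labels), set(pred_labels)
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but wrong
        fn += len(g - p)   # gold labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (invented): four entries, three hypothetical annotators each.
toy_annotations = [
    ["joy", "joy", "joy"],
    ["hope", "hope", "joy"],
    ["anger", "anger", "anger"],
    ["hope", "hope", "hope"],
]
# Toy gold and predicted label sets (invented).
toy_gold = [{"joy"}, {"hope", "joy"}, {"anger"}]
toy_pred = [{"joy"}, {"hope"}, {"sadness"}]

if __name__ == "__main__":
    print(f"Fleiss' Kappa: {fleiss_kappa(toy_annotations):.3f}")
    print(f"micro-F1:      {micro_f1(toy_gold, toy_pred):.3f}")
```

Micro-averaging pools true positives, false positives, and false negatives across all labels before computing F1, so frequent labels weigh more heavily than rare ones.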
Similar Papers
Detecting Hope, Hate, and Emotion in Arabic Textual Speech and Multi-modal Memes Using Large Language Models
Computation and Language
Finds hate speech and feelings in Arabic posts.
Optimism, Expectation, or Sarcasm? Multi-Class Hope Speech Detection in Spanish and English
Computation and Language
Helps computers understand different kinds of hope.
EmoTale: An Enacted Speech-emotion Dataset in Danish
Computation and Language
Helps computers understand Danish emotions in speech.