Algerian Dialect
By: Zakaria Benmounah, Abdennour Boulesnane
Potential Business Impact:
Helps computers understand the feelings expressed in comments on Algerian YouTube videos.
We present Algerian Dialect, a large-scale sentiment-annotated dataset of 45,000 YouTube comments written in the Algerian Arabic dialect. The comments were collected from more than 30 Algerian press and media channels using the YouTube Data API. Each comment is manually assigned to one of five sentiment categories: very negative, negative, neutral, positive, or very positive. In addition to sentiment labels, the dataset includes metadata such as collection timestamps, like counts, video URLs, and annotation dates. The dataset addresses the scarcity of publicly available resources for the Algerian dialect and aims to support research in sentiment analysis, dialectal Arabic NLP, and social media analytics. It is publicly available on Mendeley Data under a CC BY 4.0 license at https://doi.org/10.17632/zzwg3nnhsz.2.
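As a rough illustration of the record layout the abstract describes, the sketch below models one annotated comment and computes the share of each sentiment class. The five category names come from the abstract; the numeric codes, the field names (`text`, `like_count`, `video_url`, `collected_at`, `annotated_on`), and the helper `label_distribution` are assumptions made for this example and may not match the column names in the released files.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class Sentiment(IntEnum):
    """Five sentiment categories from the abstract; numeric codes are assumed."""
    VERY_NEGATIVE = -2
    NEGATIVE = -1
    NEUTRAL = 0
    POSITIVE = 1
    VERY_POSITIVE = 2


@dataclass(frozen=True)
class Comment:
    # All field names are hypothetical; check the dataset's actual columns.
    text: str
    sentiment: Sentiment
    like_count: int
    video_url: str
    collected_at: str  # collection timestamp, kept as a raw string
    annotated_on: str  # annotation date, kept as a raw string


def label_distribution(comments):
    """Return the fraction of comments in each sentiment class."""
    counts = Counter(c.sentiment for c in comments)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Include every class so that absent labels show up as 0.0.
    return {label: counts.get(label, 0) / total for label in Sentiment}
```

Checking the class balance this way is a common first step before training a classifier, since a five-way sentiment scheme often leaves the extreme classes underrepresented.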
Similar Papers
AHaSIS: Shared Task on Sentiment Analysis for Arabic Dialects
Computation and Language
Helps understand customer feelings in Arabic hotel reviews.
MAPROC at AHaSIS Shared Task: Few-Shot and Sentence Transformer for Sentiment Analysis of Arabic Hotel Reviews
Computation and Language
Helps computers understand feelings in Arabic reviews.
HausaMovieReview: A Benchmark Dataset for Sentiment Analysis in Low-Resource African Language
Computation and Language
Helps computers understand feelings expressed in a low-resource language.