LAILA: A Large Trait-Based Dataset for Arabic Automated Essay Scoring

Published: December 30, 2025 | arXiv ID: 2512.24235v1

By: May Bashendy, Walid Massoud, Sohaila Eltanbouly and more

Automated Essay Scoring (AES) has gained increasing attention in recent years, yet research on Arabic AES remains limited due to the lack of publicly available datasets. To address this, we introduce LAILA, the largest publicly available Arabic AES dataset to date, comprising 7,859 essays annotated with holistic and trait-specific scores on seven dimensions: relevance, organization, vocabulary, style, development, mechanics, and grammar. We detail the dataset design, collection, and annotations, and provide benchmark results using state-of-the-art Arabic and English models in prompt-specific and cross-prompt settings. LAILA fills a critical need in Arabic AES research, supporting the development of robust scoring systems.
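To make the annotation scheme and the cross-prompt setting concrete, here is a minimal sketch in Python. It is not the dataset's actual schema or the authors' evaluation code: the `ScoredEssay` record, its field names, and the leave-one-prompt-out protocol are assumptions chosen for illustration. Only the seven trait names come from the abstract.

```python
from dataclasses import dataclass

# Trait names as listed in the abstract.
TRAITS = (
    "relevance", "organization", "vocabulary", "style",
    "development", "mechanics", "grammar",
)


@dataclass
class ScoredEssay:
    """Hypothetical record layout; field names are illustrative assumptions."""
    essay_id: str
    prompt_id: str
    text: str
    holistic: float
    traits: dict[str, float]

    def __post_init__(self) -> None:
        # Every essay is expected to carry a score for each of the seven traits.
        missing = set(TRAITS) - set(self.traits)
        if missing:
            raise ValueError(f"missing trait scores: {sorted(missing)}")


def cross_prompt_splits(essays):
    """Yield (held_out_prompt, train, test) with one prompt held out per fold.

    Leave-one-prompt-out is one common way to set up cross-prompt
    evaluation; it is assumed here, not taken from the paper.
    """
    prompts = sorted({e.prompt_id for e in essays})
    for held_out in prompts:
        train = [e for e in essays if e.prompt_id != held_out]
        test = [e for e in essays if e.prompt_id == held_out]
        yield held_out, train, test
```

In a prompt-specific setting, by contrast, both the training and test essays would be drawn from the same prompt, so the model never has to generalize to an unseen writing task.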

Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection

Computation and Language

Teaches computers to grade Arabic essays better.

22 Mar 2025

EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models

Computation and Language

Helps computers grade essays better, even with pictures.

17 Feb 2025

How well can LLMs Grade Essays in Arabic?

Computation and Language

Helps computers grade Arabic essays better.

27 Jan 2025
