Evaluating the Creativity of LLMs in Persian Literary Text Generation
By: Armin Tourajmehr, Mohammad Reza Modarres, Yadollah Yaghoobzadeh
Potential Business Impact:
Computers write creative Persian stories.
Large language models (LLMs) have demonstrated notable creative abilities in generating literary texts, including poetry and short stories. However, prior research has centered primarily on English, with limited exploration of non-English literary traditions and no standardized methods for assessing creativity. In this paper, we evaluate the capacity of LLMs to generate Persian literary text enriched with culturally relevant expressions. We build a dataset of user-generated Persian literary texts spanning 20 diverse topics and assess model outputs along four creativity dimensions (originality, fluency, flexibility, and elaboration) by adapting the Torrance Tests of Creative Thinking. To reduce evaluation costs, we adopt an LLM as a judge for automated scoring and validate its reliability against human judgments using intraclass correlation coefficients, observing strong agreement. In addition, we analyze the models' ability to understand and employ four core literary devices: simile, metaphor, hyperbole, and antithesis. Our results highlight both the strengths and limitations of LLMs in Persian literary text generation, underscoring the need for further refinement.
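The judge-validation step can be illustrated with a minimal sketch of an intraclass correlation coefficient. The abstract does not say which ICC form the authors used; the code below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc2_1` and the example scores are hypothetical and are not taken from the paper.

```python
def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per rated text, one column per rater
    (for example, column 0 = a human annotator, column 1 = the LLM judge).
    The choice of ICC(2,1) is an assumption, not the paper's stated method.
    """
    n = len(ratings)                      # number of rated texts
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and >= 2 columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-text mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical originality scores on a 1-5 scale: [human, LLM judge] per text.
example_scores = [[4, 4], [2, 3], [5, 5], [3, 3], [1, 2]]
print(f"ICC(2,1) = {icc2_1(example_scores):.3f}")
```

Values near 1 indicate that the automated judge and the human annotators assign nearly the same scores to the same texts. Because this form measures absolute agreement, it penalizes a judge that is consistently more lenient or harsher than the human raters, which a plain correlation would not.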
Similar Papers
Capabilities and Evaluation Biases of Large Language Models in Classical Chinese Poetry Generation: A Case Study on Tang Poetry
Computation and Language
Computers write poems, but humans must check them.
Style Over Story: A Process-Oriented Study of Authorial Creativity in Large Language Models
Computation and Language
AI writing tools prefer style over story.
MELAC: Massive Evaluation of Large Language Models with Alignment of Culture in Persian Language
Computation and Language
Helps computers understand Persian language and culture better.