DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation
By: Shijian Ma, Yunqi Huang, Yan Lin
Potential Business Impact:
Helps AI write better, more believable movie scripts.
Drama script continuation requires models to maintain character consistency, advance plot coherently, and preserve dramatic structure, capabilities that existing benchmarks fail to evaluate comprehensively. We present DramaBench, the first large-scale benchmark for evaluating drama script continuation across six independent dimensions: Format Standards, Narrative Efficiency, Character Consistency, Emotional Depth, Logic Consistency, and Conflict Handling. Our framework combines rule-based analysis with LLM-based labeling and statistical metrics, ensuring objective and reproducible evaluation. We conduct a comprehensive evaluation of 8 state-of-the-art language models on 1,103 scripts (8,824 evaluations total), with rigorous statistical significance testing (252 pairwise comparisons, 65.9% significant) and human validation (188 scripts, substantial agreement on 3/5 dimensions). Our ablation studies confirm that all six dimensions capture independent quality aspects (mean |r| = 0.020). DramaBench provides actionable, dimension-specific feedback for model improvement and establishes a rigorous standard for creative writing evaluation.
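The independence figure quoted above (mean |r| = 0.020) can be read as an average of absolute correlations over pairs of dimensions. The following is a minimal sketch of that kind of computation, assuming Pearson correlation over all 15 unordered pairs of the six dimensions; the abstract does not state the exact procedure, the helper names `pearson` and `mean_abs_pairwise_r` are hypothetical, and the scores are synthetic random numbers rather than DramaBench data.

```python
import itertools
import math
import random

# Dimension names as listed in the abstract.
DIMENSIONS = [
    "Format Standards", "Narrative Efficiency", "Character Consistency",
    "Emotional Depth", "Logic Consistency", "Conflict Handling",
]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_abs_pairwise_r(columns):
    """Average |r| over every unordered pair of score columns."""
    pairs = list(itertools.combinations(columns, 2))
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


# Synthetic, independently drawn scores (assumption: one score per script
# per dimension). These are NOT the paper's results.
rng = random.Random(0)
synthetic = [[rng.gauss(0.0, 1.0) for _ in range(1103)] for _ in DIMENSIONS]
print(f"mean |r| on synthetic data: {mean_abs_pairwise_r(synthetic):.3f}")
```

A value near 0 indicates that the dimensions vary largely independently of one another, while a value near 1 would suggest that they measure overlapping qualities.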
Similar Papers
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding
CV and Pattern Recognition
Helps computers understand movie plot twists.
Speech-DRAME: A Framework for Human-Aligned Benchmarks in Speech Role-Play
Sound
Makes AI better at acting like people in conversations.
Three Stage Narrative Analysis; Plot-Sentiment Breakdown, Structure Learning and Concept Detection
Computation and Language
Helps pick movies by understanding story feelings.