Capturing Classic Authorial Style in Long-Form Story Generation with GRPO Fine-Tuning
By: Jinlong Liu, Mohammed Bahja, Venelin Kovatchev, and more
Potential Business Impact:
Writes stories that sound like a famous author.
Recent advances in large language models (LLMs) show impressive performance in open-ended story generation, but fine-grained stylistic control remains limited. Existing methods often rely on shallow cues (e.g., names or topics) to simulate authorial style, without robust evaluation. In this work, we present a training framework for style-conditioned story generation using Group Relative Policy Optimization (GRPO) and a custom multi-reward setup. The style reward is derived from a fine-tuned sentence transformer using authorship verification (AV) signals, combined with content and completeness scores to stabilize long-form narrative generation. We conduct experiments using fiction by Mark Twain, a prominent 19th-century American author, with The Adventures of Huckleberry Finn serving as the reference style exemplar. Our 8B model outperforms larger baselines such as GPT-4o and Claude Sonnet 4 in AV-style metrics, achieving a style score of 0.628 and competitive content quality. Results demonstrate the feasibility of agentic stylistic generation with moderate model size and task-specific training. While the output is clearly style-aligned, narrative completeness remains a challenge, indicating future work is needed to better model global coherence and story resolution.
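To make the multi-reward setup concrete, below is a minimal sketch (not the paper's code) of how a style reward from an AV-tuned sentence transformer could be combined with content and completeness scores, and how rewards would be normalized into group-relative advantages for GRPO. The model name, reward weights, and the content/completeness scorers are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a GRPO-style multi-reward setup under stated assumptions:
# - style reward: cosine similarity from a sentence transformer (here a
#   generic model stands in for the paper's AV-fine-tuned one)
# - content/completeness rewards: placeholder heuristics
import numpy as np
from sentence_transformers import SentenceTransformer, util

# Hypothetical embedding model; the paper fine-tunes its own with AV signals.
style_model = SentenceTransformer("all-MiniLM-L6-v2")

def style_reward(story: str, exemplar: str) -> float:
    """Cosine similarity between the story and the style exemplar."""
    emb = style_model.encode([story, exemplar], convert_to_tensor=True)
    return float(util.cos_sim(emb[0], emb[1]))

def content_reward(story: str, prompt: str) -> float:
    """Placeholder: relevance of the story to the prompt (assumed scorer)."""
    emb = style_model.encode([story, prompt], convert_to_tensor=True)
    return float(util.cos_sim(emb[0], emb[1]))

def completeness_reward(story: str, min_words: int = 400) -> float:
    """Placeholder: crude length proxy for narrative completeness."""
    return min(len(story.split()) / min_words, 1.0)

def combined_reward(story, prompt, exemplar, w=(0.5, 0.3, 0.2)) -> float:
    # Weighted sum of style, content, and completeness (weights assumed).
    return (w[0] * style_reward(story, exemplar)
            + w[1] * content_reward(story, prompt)
            + w[2] * completeness_reward(story))

def grpo_advantages(rewards: list[float]) -> np.ndarray:
    """Group-relative advantages: normalize rewards within a sampled group."""
    r = np.asarray(rewards, dtype=np.float32)
    return (r - r.mean()) / (r.std() + 1e-8)
```

In a GRPO loop, each group of completions sampled for the same prompt would be scored with `combined_reward` and converted to advantages via `grpo_advantages` for the policy-gradient update; the paper's actual reward models and weighting are more involved than this sketch.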
Similar Papers
Multi-Reward GRPO for Stable and Prosodic Single-Codebook TTS LLMs at Scale
Sound
Makes computer voices sound more natural and human.
Generation, Evaluation, and Explanation of Novelists' Styles with Single-Token Prompts
Computation and Language
AI writes like famous old authors.
Long Story Generation via Knowledge Graph and Literary Theory
Computation and Language
Writes longer, more interesting stories that don't get boring.