LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback
By: Weiyue Li, Mingxiao Song, Zhenda Shen, and more
Potential Business Impact:
Helps AI write more creative stories.
Large Language Models (LLMs) often struggle with creative generation, and multi-agent frameworks that improve reasoning through interaction can paradoxically hinder creativity by inducing content homogenization. We introduce LLM Review, a peer-review-inspired framework implementing Blind Peer Review: agents exchange targeted feedback while revising independently, preserving divergent creative trajectories. To enable rigorous evaluation, we propose SciFi-100, a science fiction writing dataset with a unified evaluation framework combining LLM-as-a-judge scoring, human annotation, and rule-based novelty metrics. Experiments demonstrate that LLM Review consistently outperforms multi-agent baselines and that smaller models equipped with our framework can surpass larger single-agent models, suggesting that interaction structure may substitute for model scale.
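The core mechanism is easy to sketch. Below is a minimal, illustrative Python loop for the Blind Peer Review protocol as the abstract describes it: agents draft independently, exchange targeted feedback, and revise only their own drafts, never seeing or merging a peer's text. The `call_llm` helper, the prompts, and the agent and round counts are hypothetical placeholders, not the paper's actual implementation.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: substitute a real chat-completion API call here.
    return f"[model output for prompt of length {len(prompt)}]"


def blind_peer_review(task: str, n_agents: int = 3, rounds: int = 2) -> list[str]:
    """Sketch of Blind Peer Review: feedback flows between agents, drafts do not."""
    # Each agent drafts independently from the same writing prompt.
    drafts = [call_llm(f"Write a science fiction story for: {task}")
              for _ in range(n_agents)]

    for _ in range(rounds):
        # Review phase: agent i critiques exactly one peer's draft,
        # without seeing the other drafts (the "blind" part).
        reviews = [
            call_llm(f"Give targeted feedback on this story:\n{drafts[(i + 1) % n_agents]}")
            for i in range(n_agents)
        ]
        # Revision phase: each agent revises only its own draft, guided by
        # the feedback it received; drafts are never merged or shared.
        drafts = [
            call_llm(
                f"Revise your story using this feedback.\n"
                f"Story:\n{drafts[i]}\n"
                f"Feedback:\n{reviews[(i - 1) % n_agents]}"
            )
            for i in range(n_agents)
        ]
    return drafts


if __name__ == "__main__":
    final_drafts = blind_peer_review("first contact, told from the alien's point of view")
    print(len(final_drafts), "independent drafts produced")
```

The design choice to note is that revision is strictly per-agent: critiques circulate, but drafts never do, which is what preserves divergent creative trajectories and avoids the content homogenization the abstract warns about.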
Similar Papers
LLM-REVal: Can We Trust LLM Reviewers Yet?
Computation and Language
AI reviewers unfairly favor AI-written papers.
Can LLM feedback enhance review quality? A randomized study of 20K reviews at ICLR 2025
Artificial Intelligence
Helps AI paper reviewers write better feedback.
Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process
Computation and Language
Chooses the best AI answer from many.