Once Upon a Time: Interactive Learning for Storytelling with Small Language Models
By: Jonas Mayer Martins, Ali Hamza Bashir, Muhammad Rehan Khalid, and others
Potential Business Impact:
Teaches computers to write stories with less data.
Children efficiently acquire language not just by listening, but by interacting with others in their social environment. In contrast, large language models are typically trained with next-word prediction on massive amounts of text. Motivated by this contrast, we investigate whether language models can be trained with less data by learning not only from next-word prediction but also from high-level, cognitively inspired feedback. We train a student model to generate stories, which a teacher model rates on readability, narrative coherence, and creativity. By varying the amount of pretraining before the feedback loop, we assess the impact of this interactive learning on formal and functional linguistic competence. We find that the high-level feedback is highly data-efficient: with just 1M words of input in interactive learning, storytelling skills can improve as much as with 410M words of next-word prediction.
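The student-teacher loop described in the abstract maps naturally onto a reinforcement-style update. Below is a minimal sketch, assuming a HuggingFace-style causal LM as the student and a stubbed scalar-score teacher; the REINFORCE update and the toy reward function are illustrative assumptions, not the paper's exact training procedure.

```python
# Minimal sketch of the interactive learning loop described above.
# Assumptions (not from the paper): GPT-2 as a stand-in student, a teacher
# that returns one scalar in [0, 1] combining readability, coherence, and
# creativity, and a plain REINFORCE-style policy update.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("gpt2")
student = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
optim = torch.optim.AdamW(student.parameters(), lr=1e-5)

def teacher_score(story: str) -> float:
    """Hypothetical teacher: in the paper a teacher model rates the story on
    readability, narrative coherence, and creativity. Stubbed here with a
    toy lexical-diversity proxy so the sketch is self-contained."""
    return min(len(set(story.split())) / 100.0, 1.0)

prompt = "Once upon a time"
inputs = tok(prompt, return_tensors="pt").to(device)

for step in range(3):  # a few illustrative feedback rounds
    # Student samples a story, so there is something to reinforce.
    out = student.generate(**inputs, do_sample=True, max_new_tokens=64,
                           pad_token_id=tok.eos_token_id)
    story = tok.decode(out[0], skip_special_tokens=True)
    reward = teacher_score(story)

    # REINFORCE: log-likelihood of the sampled tokens, weighted by reward.
    logits = student(out).logits[:, :-1]              # predict token t+1
    logp = torch.log_softmax(logits, dim=-1)
    token_logp = logp.gather(2, out[:, 1:].unsqueeze(-1)).squeeze(-1)
    loss = -(reward * token_logp.mean())

    optim.zero_grad()
    loss.backward()
    optim.step()
    print(f"step {step}: reward={reward:.2f} loss={loss.item():.3f}")
```

In practice one would add a baseline to reduce gradient variance and score only the generated continuation rather than the prompt, but the sketch shows the core idea: the teacher's high-level rating, not next-word prediction alone, drives the student's update.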
Similar Papers
Towards Data-Efficient Language Models: A Child-Inspired Approach to Language Learning
Computation and Language
Teaches computers to learn language like kids.
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Computation and Language
Teaches computers to learn language like babies.
Listening with Language Models: Using LLMs to Collect and Interpret Classroom Feedback
Computers and Society
AI chatbot helps teachers get better student feedback.