Prompt reinforcing for long-term planning of large language models
By: Hsien-Chin Lin, Benjamin Matthias Ruppik, Carel van Niekerk, and more
Potential Business Impact:
Helps computers remember conversations to finish tasks.
Large language models (LLMs) have achieved remarkable success in a wide range of natural language processing tasks and can be adapted through prompting. However, they remain suboptimal in multi-turn interactions: they often rely on incorrect early assumptions and fail to track user goals over time, which makes such tasks particularly challenging. Prior work in dialogue systems has shown that long-term planning is essential for handling interactive tasks. In this work, we propose a prompt optimisation framework inspired by reinforcement learning, which enables such planning by modifying only the task instruction prompt of the LLM-based agent. By generating turn-by-turn feedback and leveraging experience replay for prompt rewriting, our proposed method shows significant improvement in multi-turn tasks such as text-to-SQL and task-oriented dialogue. Moreover, it generalises across different LLM-based agents and can leverage diverse LLMs as meta-prompting agents. This warrants future research in reinforcement learning-inspired parameter-free optimisation methods.
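The loop the abstract describes can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: `run_turn`, `critique_turn`, and `rewrite_prompt` are stub functions standing in for LLM calls (the frozen agent, a turn-level critic, and the meta-prompting agent, respectively). The key idea it mirrors is that only the task instruction prompt is updated, never the model's parameters, with turn-by-turn feedback accumulated in an experience replay buffer.

```python
import random
from collections import deque


def run_turn(prompt: str, user_utterance: str) -> str:
    """Stub for the frozen LLM agent acting under the current prompt."""
    return f"response to {user_utterance!r} under a {len(prompt)}-char prompt"


def critique_turn(user_utterance: str, agent_response: str) -> dict:
    """Stub for turn-by-turn feedback (e.g. a critic LLM scoring one turn)."""
    return {
        "turn": user_utterance,
        "response": agent_response,
        "feedback": "track the user's stated goal across turns",
    }


def rewrite_prompt(prompt: str, sampled_experiences: list) -> str:
    """Stub for the meta-prompting agent: rewrites the task instruction
    using feedback sampled from the replay buffer."""
    hints = sorted({e["feedback"] for e in sampled_experiences})
    return prompt + " Hint: " + "; ".join(hints)


def optimise_prompt(initial_prompt: str, dialogues: list,
                    buffer_size: int = 32, batch_size: int = 4) -> str:
    """RL-inspired, parameter-free optimisation: the prompt plays the role
    of the policy, and the replay buffer stores per-turn feedback."""
    prompt = initial_prompt
    replay = deque(maxlen=buffer_size)  # experience replay buffer
    for dialogue in dialogues:
        for user_utterance in dialogue:  # multi-turn interaction
            response = run_turn(prompt, user_utterance)
            replay.append(critique_turn(user_utterance, response))
        batch = random.sample(list(replay), min(batch_size, len(replay)))
        prompt = rewrite_prompt(prompt, batch)  # update prompt, not weights
    return prompt
```

In this sketch each dialogue ends with one prompt-rewriting step over a random batch of stored turn critiques; the real method's critic and rewriter would be LLMs, and the feedback would be task-specific (e.g. SQL execution results or dialogue success signals).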
Similar Papers
System Prompt Optimization with Meta-Learning
Computation and Language
Makes AI understand instructions better for any task.
Leveraging Pre-trained Large Language Models with Refined Prompting for Online Task and Motion Planning
Robotics
Robots learn to fix mistakes while working.
PromptFlow: Training Prompts Like Neural Networks
Artificial Intelligence
Teaches computers to write better instructions automatically.