Convert Language Model into a Value-based Strategic Planner
By: Xiaoyu Wang, Yue Zhao, Qingqing Gu, and more
Potential Business Impact:
Helps computers give better emotional support talks.
Emotional support conversation (ESC) aims to alleviate individuals' emotional distress through effective conversations. Although large language models (LLMs) have made remarkable progress on ESC, most of these studies may not formulate the problem from a state-model perspective and therefore provide suboptimal solutions for long-term satisfaction. To address this issue, we apply Q-learning to LLMs and propose a framework called straQ*. Our framework allows a plug-and-play LLM to bootstrap planning during ESC, determine the optimal strategy based on long-term returns, and finally guide the LLM's response. Extensive experiments on ESC datasets suggest that straQ* outperforms many baselines, including direct inference, self-refine, chain of thought, finetuning, and finite state machines.
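To make the idea of choosing a strategy by long-term return concrete, here is a minimal tabular Q-learning sketch. It is not the straQ* implementation: the abstract describes an LLM doing the planning, whereas this toy swaps in a lookup table. The strategy labels, state names, rewards, and the helper names `make_q_table`, `best_strategy`, and `q_update` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical strategy labels, for illustration only (not taken from the paper).
STRATEGIES = ["question", "reflect_feelings", "give_suggestion"]


def make_q_table():
    # Maps (state, strategy) -> estimated long-term return; unseen pairs start at 0.0.
    return defaultdict(float)


def best_strategy(q, state):
    # Greedy choice: the strategy with the highest estimated long-term return.
    return max(STRATEGIES, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, done, alpha=0.5, gamma=0.9):
    # Standard Q-learning update:
    #   Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    future = 0.0 if done else max(q[(next_state, a)] for a in STRATEGIES)
    target = reward + gamma * future
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = make_q_table()
    # Toy two-turn dialogue: (state, strategy, reward, next_state, done).
    # Only the final turn is rewarded, so the first turn must learn its value
    # by bootstrapping from the estimate of the turn that follows it.
    episode = [
        ("user_distressed", "reflect_feelings", 0.0, "user_calmer", False),
        ("user_calmer", "give_suggestion", 1.0, "end", True),
    ]
    for _ in range(20):
        for s, a, r, s2, done in episode:
            q_update(q, s, a, r, s2, done)
    print(best_strategy(q, "user_distressed"))
```

In this toy, the first turn receives no immediate reward, yet its value becomes positive because the update propagates the later reward backward. That is the sense in which a value-based planner prefers strategies for their long-term rather than immediate payoff.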
Similar Papers
Mitigating Strategy Preference Bias in Emotional Support Conversation via Uncertainty Estimations
Computation and Language
Helps computers give better emotional support talks.
Towards Open-Ended Emotional Support Conversations in LLMs via Reinforcement Learning with Future-Oriented Rewards
Artificial Intelligence
Helps computers give better emotional support.
Emotional Support with LLM-based Empathetic Dialogue Generation
Artificial Intelligence
Helps computers give comforting and helpful advice.