Decoding Human-LLM Collaboration in Coding: An Empirical Study of Multi-Turn Conversations in the Wild
By: Binquan Zhang, Li Zhang, Haoyuan Zhang, and others
Potential Business Impact:
Helps computers and people write better code together.
Large language models (LLMs) are increasingly acting as dynamic conversational interfaces, supporting multi-turn interactions that mimic human conversation and facilitate complex tasks like coding. While datasets such as LMSYS-Chat-1M and WildChat capture real-world user-LLM conversations, few studies systematically explore the mechanisms of human-LLM collaboration in coding scenarios. What tortuous paths do users experience during the interaction process? How well do LLMs follow instructions? Are users satisfied? In this paper, we conduct an empirical analysis of human-LLM coding collaboration using the LMSYS-Chat-1M and WildChat datasets to explore the human-LLM collaboration mechanism, LLMs' instruction-following ability, and human satisfaction. This study yields several findings: 1) Task types shape interaction patterns (linear, star, and tree), with code quality optimization favoring linear patterns, design-driven tasks leaning toward tree structures, and queries preferring star patterns; 2) Bug fixing and code refactoring pose greater challenges to LLMs' instruction following, with non-compliance rates notably higher than in information querying; 3) Code quality optimization and requirements-driven development tasks show lower user satisfaction, whereas structured knowledge queries and algorithm design tasks yield higher satisfaction. These insights offer recommendations for improving LLM interfaces and user satisfaction in coding collaborations, while highlighting avenues for future research on adaptive dialogue systems. We believe this work broadens understanding of human-LLM synergies and supports more effective AI-assisted development.
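The three interaction patterns in finding 1 can be understood as branching shapes of the conversation: each user turn either continues the previous exchange or branches off an earlier one. A minimal sketch of how such shapes might be distinguished is below; the paper does not publish code, so the representation (a turn-to-parent map) and the function name are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: classifying a multi-turn conversation's branching
# structure into linear, star, or tree patterns, as named in the abstract.
# Each turn id maps to the earlier turn it branches from (None = opener).

def classify_pattern(parent_of):
    """Return 'linear', 'star', or 'tree' for a conversation's shape."""
    children = {}
    root = None
    for turn, parent in parent_of.items():
        if parent is None:
            root = turn                      # the conversation's opening turn
        else:
            children.setdefault(parent, []).append(turn)
    fanouts = [len(c) for c in children.values()]
    if all(f <= 1 for f in fanouts):
        return "linear"                      # each turn follows the last
    if set(children) == {root}:
        return "star"                        # all follow-ups branch off the opener
    return "tree"                            # mixed branching elsewhere

# A strictly sequential chat: 0 -> 1 -> 2
print(classify_pattern({0: None, 1: 0, 2: 1}))        # linear
# Three independent queries branching off one opening turn
print(classify_pattern({0: None, 1: 0, 2: 0, 3: 0}))  # star
```

This matches the intuition in the findings: iterative code-quality refinement produces long single chains (linear), one-off queries fan out from a single prompt (star), and design-driven exploration mixes both (tree).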
Similar Papers
Developer-LLM Conversations: An Empirical Study of Interactions and Generated Code Quality
Software Engineering
Helps computers write better code by fixing mistakes.
NeuroSync: Intent-Aware Code-Based Problem Solving via Direct LLM Understanding Modification
Human-Computer Interaction
Helps people tell computers what to do better.
Model-Assisted and Human-Guided: Perceptions and Practices of Software Professionals Using LLMs for Coding
Software Engineering
Helps coders build software faster and smarter.