Decoding Human-LLM Collaboration in Coding: An Empirical Study of Multi-Turn Conversations in the Wild
By: Binquan Zhang, Li Zhang, Haoyuan Zhang, and others
Potential Business Impact:
Helps computers and people write better code together.
Large language models (LLMs) are increasingly acting as dynamic conversational interfaces, supporting multi-turn interactions that mimic human conversation and facilitate complex tasks like coding. While datasets such as LMSYS-Chat-1M and WildChat capture real-world user-LLM conversations, few studies systematically explore the mechanisms of human-LLM collaboration in coding scenarios. What tortuous paths do users experience during the interaction process? How well do LLMs follow instructions? Are users satisfied? In this paper, we conduct an empirical analysis of human-LLM coding collaboration using the LMSYS-Chat-1M and WildChat datasets to explore the human-LLM collaboration mechanism, LLMs' instruction-following ability, and human satisfaction. This study yields several findings: 1) Task types shape interaction patterns (linear, star, and tree), with code quality optimization favoring linear patterns, design-driven tasks leaning toward tree structures, and queries preferring star patterns; 2) Bug fixing and code refactoring pose greater challenges to LLMs' instruction following, with non-compliance rates notably higher than in information querying; 3) Code quality optimization and requirements-driven development tasks show lower user satisfaction, whereas structured knowledge queries and algorithm design tasks yield higher satisfaction. These insights offer recommendations for improving LLM interfaces and user satisfaction in coding collaborations, while highlighting avenues for future research on adaptive dialogue systems. We believe this work broadens understanding of human-LLM synergies and supports more effective AI-assisted development.
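The three interaction patterns in finding 1 can be understood as branching shapes of the conversation: each user turn either continues the previous exchange or branches off an earlier one. A minimal sketch of how such shapes might be distinguished is below; the paper does not publish code, so the representation (a turn-to-parent map) and the function name are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: classifying a multi-turn conversation's branching
# structure into linear, star, or tree patterns, as named in the abstract.
# Each turn id maps to the earlier turn it branches from (None = opener).

def classify_pattern(parent_of):
    """Return 'linear', 'star', or 'tree' for a conversation's shape."""
    children = {}
    root = None
    for turn, parent in parent_of.items():
        if parent is None:
            root = turn                      # the conversation's opening turn
        else:
            children.setdefault(parent, []).append(turn)
    fanouts = [len(c) for c in children.values()]
    if all(f <= 1 for f in fanouts):
        return "linear"                      # each turn follows the last
    if set(children) == {root}:
        return "star"                        # all follow-ups branch off the opener
    return "tree"                            # mixed branching elsewhere

# A strictly sequential chat: 0 -> 1 -> 2
print(classify_pattern({0: None, 1: 0, 2: 1}))        # linear
# Three independent queries branching off one opening turn
print(classify_pattern({0: None, 1: 0, 2: 0, 3: 0}))  # star
```

This matches the intuition in the findings: iterative code-quality refinement produces long single chains (linear), one-off queries fan out from a single prompt (star), and design-driven exploration mixes both (tree).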
Similar Papers
Developer-LLM Conversations: An Empirical Study of Interactions and Generated Code Quality
Software Engineering
Helps computers write better code by fixing mistakes.
NeuroSync: Intent-Aware Code-Based Problem Solving via Direct LLM Understanding Modification
Human-Computer Interaction
Helps people tell computers what to do better.
Model-Assisted and Human-Guided: Perceptions and Practices of Software Professionals Using LLMs for Coding
Software Engineering
Helps coders build software faster and smarter.