Bottom-Up Synthesis of Knowledge-Grounded Task-Oriented Dialogues with Iteratively Self-Refined Prompts
By: Kun Qian, Maximillian Chen, Siyan Li, and more
Potential Business Impact:
Creates better AI chatbots by building conversations step-by-step.
Training conversational question-answering (QA) systems requires a substantial amount of in-domain data, which is often scarce in practice. A common solution to this challenge is to generate synthetic data. Traditional methods typically follow a top-down approach, where a large language model (LLM) generates multi-turn dialogues from a broad prompt. Although this method produces coherent conversations, it offers limited fine-grained control over the content and is susceptible to hallucinations. We introduce a bottom-up conversation synthesis approach, where QA pairs are generated first and then combined into a coherent dialogue. This method offers greater control and precision by dividing the process into two distinct steps, allowing refined instructions and validations to be handled separately. Additionally, this structure allows the use of non-local models in stages that do not involve proprietary knowledge, enhancing the overall quality of the generated data. Both human and automated evaluations demonstrate that our approach produces more realistic and higher-quality dialogues compared to top-down methods.
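The two-stage pipeline described in the abstract can be pictured roughly as follows. This is a minimal sketch, not the authors' implementation: `generate_qa_pairs`, `validate_qa_pair`, `stitch_dialogue`, and `call_llm` are hypothetical names, and the prompts and overlap-based validation rule are simplified stand-ins for the paper's refined instructions and checks.

```python
# Illustrative bottom-up dialogue synthesis pipeline (sketch, not the paper's code).
# `call_llm` is a placeholder for any LLM client; stage 1 could use a local model
# over proprietary knowledge, while stage 2 needs no proprietary data.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAPair:
    question: str
    answer: str
    source_passage: str  # knowledge snippet the answer must be grounded in


def generate_qa_pairs(passages: List[str], call_llm: Callable[[str], str]) -> List[QAPair]:
    """Stage 1: create grounded QA pairs, one prompt per knowledge passage."""
    pairs = []
    for passage in passages:
        prompt = (
            "Write one question a user might ask and its answer, "
            f"using ONLY this passage:\n{passage}"
        )
        raw = call_llm(prompt)
        question, _, answer = raw.partition("\n")
        pairs.append(QAPair(question.strip(), answer.strip(), passage))
    return pairs


def validate_qa_pair(pair: QAPair) -> bool:
    """Cheap grounding check: the answer should overlap with its source passage.
    A real pipeline would apply stricter, possibly model-based validation here."""
    answer_tokens = set(pair.answer.lower().split())
    passage_tokens = set(pair.source_passage.lower().split())
    return len(answer_tokens & passage_tokens) / max(len(answer_tokens), 1) > 0.3


def stitch_dialogue(pairs: List[QAPair], call_llm: Callable[[str], str]) -> str:
    """Stage 2: combine validated QA pairs into one coherent multi-turn dialogue."""
    qa_text = "\n".join(f"Q: {p.question}\nA: {p.answer}" for p in pairs)
    prompt = (
        "Rewrite the following QA pairs as a natural multi-turn dialogue "
        f"between a user and an agent, keeping every answer faithful:\n{qa_text}"
    )
    return call_llm(prompt)


def synthesize(passages: List[str], call_llm: Callable[[str], str]) -> str:
    """Full bottom-up pass: generate, filter, then stitch."""
    pairs = [p for p in generate_qa_pairs(passages, call_llm) if validate_qa_pair(p)]
    return stitch_dialogue(pairs, call_llm)
```

The split mirrors the abstract's argument: grounded generation and validation happen per QA pair in the first stage, so errors can be caught locally, while the second stage only handles surface coherence and can be delegated to a stronger, non-local model.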
Similar Papers
Synthetic Clarification and Correction Dialogues about Data-Centric Tasks -- A Teacher-Student Approach
Computation and Language
AI learns to ask questions and fix mistakes.
Think Less, Label Better: Multi-Stage Domain-Grounded Synthetic Data Generation for Fine-Tuning Large Language Models in Telecommunications
Computation and Language
Teaches AI hard, specialized jobs without human labeling.
BMGQ: A Bottom-up Method for Generating Complex Multi-hop Reasoning Questions from Semi-structured Data
Artificial Intelligence
Makes computers better at answering tricky questions.