Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation
By: Kaiyuan Liu, Shaotian Yan, Rui Miao, and more
Potential Business Impact:
Helps small AI learn to think like big AI.
Reasoning distillation has attracted increasing attention. It typically leverages a large teacher model to generate reasoning paths, which are then used to fine-tune a student model so that it mimics the teacher's behavior in training contexts. However, previous approaches have lacked a detailed analysis of the origins of the distilled model's capabilities. It remains unclear whether the student maintains behavior consistent with the teacher in novel test-time contexts, or whether it regresses to its original output patterns, raising concerns about the generalization of distilled models. To address this question, we introduce a cross-model Reasoning Distillation Provenance Tracing framework. For each action (e.g., a sentence) produced by the distilled model, we obtain the predictive probabilities assigned by the teacher, the original student, and the distilled model under the same context. By comparing these probabilities, we classify each action into provenance categories. By systematically disentangling the provenance of each action, we experimentally demonstrate that, in test-time contexts, the distilled model can indeed generate teacher-originated actions, which correlate with, and plausibly explain, the observed performance of the distilled model. Building on this analysis, we further propose a teacher-guided data selection method. Unlike prior approaches that rely on heuristics, our method directly compares teacher-student divergences on the training data, providing a principled selection criterion. We validate the effectiveness of our approach across multiple representative teacher models and diverse student models. The results highlight the utility of our provenance-tracing framework and underscore its promise for reasoning distillation. We hope to share Reasoning Distillation Provenance Tracing and our insights into reasoning distillation with the community.
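The abstract does not specify how the probability comparison or the divergence-based selection is implemented, but the following minimal sketch illustrates one plausible reading. It assumes all three models are causal LMs loadable via Hugging Face Transformers, that the teacher and student share a tokenizer (required for the KL term), and that the margin threshold and the three-way labels are illustrative placeholders rather than the paper's actual taxonomy.

```python
# Hypothetical sketch of (1) scoring a generated sentence ("action") under a model
# and (2) a teacher-student divergence score for training-data selection.
# Names, thresholds, and labels are illustrative assumptions, not the paper's method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def sentence_logprob(model, tokenizer, context: str, sentence: str) -> float:
    """Sum of token log-probabilities of `sentence` conditioned on `context`.
    Assumes the context/sentence token boundary is preserved by the tokenizer."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits                     # (1, T, vocab)
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)   # position t predicts token t+1
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp[0, ctx_len - 1:].sum().item()           # keep only the sentence tokens


def classify_action(lp_teacher: float, lp_student: float, margin: float = 1.0) -> str:
    """Toy provenance label: which source model better explains the generated action."""
    if lp_teacher - lp_student > margin:
        return "teacher-originated"
    if lp_student - lp_teacher > margin:
        return "student-originated"
    return "shared/ambiguous"


def teacher_student_divergence(teacher, student, tokenizer, text: str) -> float:
    """Mean per-token KL(teacher || student) over one training example; a possible
    selection score that prioritizes examples the student has not yet absorbed."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        log_p = torch.log_softmax(teacher(ids).logits[:, :-1], dim=-1)  # teacher
        log_q = torch.log_softmax(student(ids).logits[:, :-1], dim=-1)  # student
    kl = torch.nn.functional.kl_div(log_q, log_p, log_target=True, reduction="none")
    return kl.sum(-1).mean().item()
```

A full pipeline along these lines would segment the distilled model's outputs into sentences, score each one under the teacher, the original student, and the distilled model in the same context, and aggregate the resulting labels across a test set; the margin rule above is only one simple way to turn those probabilities into categories.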
Similar Papers
From Reasoning LLMs to BERT: A Two-Stage Distillation Framework for Search Relevance
Information Retrieval
Makes online shopping search faster and smarter.
Towards Understanding Distilled Reasoning Models: A Representational Approach
Machine Learning (CS)
Teaches AI to think smarter and check its work.
Distilling the Essence: Efficient Reasoning Distillation via Sequence Truncation
Computation and Language
Makes AI learn faster by focusing on thinking steps.