SCRIBE: Structured Chain Reasoning for Interactive Behaviour Explanations using Tool Calling
By: Fares Fawzi, Vinitra Swamy, Dominik Glandorf, and more
Potential Business Impact:
Helps teachers give better, private student feedback.
Language models can be used to provide interactive, personalized student feedback in educational settings. However, real-world deployment faces three key challenges: privacy concerns, limited computational resources, and the need for pedagogically valid responses. These constraints require small, open-source models that can run locally and reliably ground their outputs in correct information. We introduce SCRIBE, a framework for multi-hop, tool-augmented reasoning designed to generate valid responses to student questions about feedback reports. SCRIBE combines domain-specific tools with a self-reflective inference pipeline that supports iterative reasoning, tool use, and error recovery. We distil these capabilities into 3B and 8B models via two-stage LoRA fine-tuning on synthetic GPT-4o-generated data. Evaluation with a human-aligned GPT-Judge and a user study with 108 students show that 8B-SCRIBE models achieve comparable or superior quality to much larger models in key dimensions such as relevance and actionability, while being perceived by students as on par with GPT-4o and Llama-3.3 70B. These findings demonstrate the viability of SCRIBE for low-resource, privacy-sensitive educational applications.
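The abstract does not spell out how the self-reflective inference pipeline is implemented. The sketch below is a minimal, hypothetical illustration of the general idea it describes: a small local model iteratively decides whether to call a domain-specific tool or answer, with simple error recovery when a tool call fails. All names here (run_scribe, TOOLS, call_model) and the toy stub model are assumptions for illustration, not the authors' actual API or code.

```python
# Minimal sketch of an iterative, tool-augmented reasoning loop with error
# recovery, in the spirit of SCRIBE. Hypothetical names throughout.
import json

# Hypothetical domain-specific tools that ground answers in the feedback report.
TOOLS = {
    "get_report_section": lambda section: f"[contents of report section '{section}']",
    "get_grade_stats": lambda course: f"[grade statistics for '{course}']",
}

def call_model(messages):
    """Stand-in for a locally hosted 3B/8B model. This toy stub first requests
    a tool call, then answers once tool output is available in the context."""
    if any(m["role"] == "tool" for m in messages):
        return json.dumps({"answer": "Your report shows strong quiz scores; "
                                     "focus next on the week-3 exercises."})
    return json.dumps({"tool": "get_report_section", "args": {"section": "quizzes"}})

def run_scribe(question, max_steps=5):
    """Iterate: ask the model for an action, execute tools, feed results back."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        action = json.loads(call_model(messages))
        if "answer" in action:                      # model decides it can answer
            return action["answer"]
        tool = TOOLS.get(action.get("tool"))
        if tool is None:                            # error recovery: reflect and retry
            messages.append({"role": "system",
                             "content": f"Unknown tool {action.get('tool')!r}; "
                                        f"available: {list(TOOLS)}. Try again."})
            continue
        try:
            result = tool(**action.get("args", {}))
        except TypeError as exc:                    # bad arguments: surface the error
            result = f"tool error: {exc}"
        messages.append({"role": "tool", "content": result})
    return "Sorry, I could not ground an answer in the report."

if __name__ == "__main__":
    print(run_scribe("Why did I lose points on the quizzes?"))
```

In the paper's setting, call_model would be replaced by inference with the LoRA fine-tuned 3B or 8B model running locally, keeping student data on-premises.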
Similar Papers
SCRIBE: Structured Mid-Level Supervision for Tool-Using Language Models
Artificial Intelligence
Teaches language models to use tools better.
BRAID: Bounded Reasoning for Autonomous Inference and Decisions
Computation and Language
Makes AI think smarter and cheaper.
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios
Computation and Language
Tests how well computers can think through many steps.