Narrowing the Gap: Supervised Fine-Tuning of Open-Source LLMs as a Viable Alternative to Proprietary Models for Pedagogical Tools

Published: July 7, 2025 | arXiv ID: 2507.05305v1

By: Lorenzo Lee Solano, Charles Koutcheme, Juho Leinonen, and more

Potential Business Impact:

Teaches computers to explain coding mistakes better.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Frontier large language models (LLMs) like ChatGPT and Gemini can decipher cryptic compiler errors for novice programmers, but their computational scale, cost, and tendency to over-assist make them problematic for widespread pedagogical adoption. This work demonstrates that smaller, specialised language models, enhanced via Supervised Fine-Tuning (SFT), present a more viable alternative for educational tools. We utilise a new dataset of 40,000 C compiler error explanations, derived from real errors produced by introductory programming (CS1/2) students, to fine-tune three open-source models: Qwen3-4B, Llama-3.1-8B, and Qwen3-32B. We perform a dual evaluation, combining expert human reviews with a large-scale automated analysis of 8,000 responses using a validated LLM-as-judge ensemble. Our results show that SFT significantly boosts the pedagogical quality of smaller models, achieving performance comparable to much larger models. We analyse the trade-offs between model size and quality, confirming that fine-tuning compact, efficient models on high-quality, domain-specific data is a potent strategy for creating specialised models to drive educational tools. We provide a replicable methodology to foster broader access to generative AI capabilities in educational contexts.
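For readers who want a concrete picture of the fine-tuning step, here is a minimal sketch of SFT on compiler-error explanations using the Hugging Face TRL library. This is an illustration under stated assumptions, not the authors' released code: the JSONL file name, field names, and prompt wording are hypothetical, and only the model ID (Qwen/Qwen3-4B, the smallest of the three models studied) comes from the paper.

```python
# Hypothetical SFT sketch: fine-tune a small open model on (error, explanation)
# pairs. Dataset schema and file name are illustrative assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed JSONL schema: {"error": "<C compiler diagnostic>",
#                        "explanation": "<pedagogical explanation>"}
raw = load_dataset("json", data_files="cs1_compiler_errors.jsonl", split="train")

def to_chat(example):
    # Chat-style records let SFTTrainer apply the model's own chat template.
    return {
        "messages": [
            {"role": "user",
             "content": f"Explain this C compiler error to a beginner:\n{example['error']}"},
            {"role": "assistant", "content": example["explanation"]},
        ]
    }

train = raw.map(to_chat, remove_columns=raw.column_names)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B",  # smallest of the three models studied in the paper
    train_dataset=train,
    args=SFTConfig(output_dir="qwen3-4b-c-errors", num_train_epochs=1),
)
trainer.train()
```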
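The automated side of the evaluation uses a validated LLM-as-judge ensemble over 8,000 responses. The abstract does not specify the rubric, scale, judge models, or aggregation, so the sketch below is one plausible shape: each judge scores an explanation against a rubric and the scores are averaged. The rubric text, 1-5 scale, and neutral fallback are all assumptions; the client targets any OpenAI-compatible endpoint (e.g. a local vLLM server).

```python
# Hypothetical LLM-as-judge ensemble: several judges each score a candidate
# explanation; averaging damps any single judge's bias. Rubric and scale are
# illustrative assumptions, not the paper's instrument.
import re
import statistics
from openai import OpenAI  # works against any OpenAI-compatible endpoint

client = OpenAI()  # assumes API key / base_url are configured externally

RUBRIC = (
    "Rate the explanation of the C compiler error for a CS1 student on a "
    "1-5 scale (accuracy, clarity, no over-assistance). Reply with the "
    "number only."
)

def judge_score(judge_model: str, error: str, explanation: str) -> int:
    reply = client.chat.completions.create(
        model=judge_model,
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user",
             "content": f"Error:\n{error}\n\nExplanation:\n{explanation}"},
        ],
    )
    match = re.search(r"[1-5]", reply.choices[0].message.content)
    return int(match.group()) if match else 3  # neutral fallback on parse failure

def ensemble_score(error: str, explanation: str, judges: list[str]) -> float:
    # Mean over the ensemble; the paper's actual aggregation is not specified.
    return statistics.mean(judge_score(j, error, explanation) for j in judges)
```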

Country of Origin
🇦🇺 🇫🇮 Australia, Finland

Page Count
7 pages

Category
Computer Science: Computers and Society