Technical Report: Full-Stack Fine-Tuning for the Q Programming Language

Published: August 9, 2025 | arXiv ID: 2508.06813v2

By: Brendan R. Hogan, Will Brown, Adel Boyarsky and more

Potential Business Impact:

Teaches AI to code in rare computer languages.

Even though large language models are becoming increasingly capable, it is still unreasonable to expect them to excel at tasks that are under-represented on the Internet. Leveraging LLMs for specialized applications, particularly in niche programming languages and private domains, remains challenging and largely unsolved. In this work, we address this gap by presenting a comprehensive, open-source approach for adapting LLMs to the Q programming language, a popular tool in quantitative finance that is much less present on the Internet than Python, C, Java, and other "mainstream" languages and is therefore not a strong suit of general-purpose AI models. We introduce a new LeetCode-style evaluation dataset for Q, benchmark major frontier models on it, and then apply pretraining, supervised fine-tuning, and reinforcement learning to train a suite of reasoning and non-reasoning models based on the Qwen-2.5 series, spanning five parameter sizes (1.5B, 3B, 7B, 14B, 32B). Our best model achieves a pass@1 accuracy of 59 percent on our Q benchmark, surpassing the best-performing frontier model, Claude Opus-4, by 29.5 percent. Additionally, all of our models, even the 1.5B model, outperform GPT-4.1 on this task. In addition to releasing models, code, and data, we provide a detailed blueprint for dataset construction, model pretraining, supervised fine-tuning, and reinforcement learning. Our methodology is broadly applicable, and we discuss how these techniques can be extended to other tasks, including those where evaluation may rely on soft or subjective signals.
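For readers unfamiliar with the pass@1 metric quoted in the abstract, the following is a minimal sketch of how a pass@k score is commonly computed, using the standard unbiased estimator 1 - C(n-c, k)/C(n, k) for n samples of which c pass. The abstract does not describe the report's exact evaluation protocol, so this is an assumption rather than the authors' implementation; the function names and the toy results are hypothetical and are not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of sampled solutions, c: how many passed all tests,
    k: number of attempts the metric allows.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimates over (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical toy data: four problems, four samples each.
toy_results = [(4, 4), (4, 2), (4, 0), (4, 1)]
print(benchmark_pass_at_k(toy_results, k=1))  # 0.4375
```

With k = 1 the estimator reduces to c/n per problem, so pass@1 is simply the average fraction of sampled solutions that pass all test cases.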

Fine-tuning of lightweight large language models for sentiment classification on heterogeneous financial textual data

Computation and Language

Small AI models understand money news well.

30 Nov 2025

QCoder Benchmark: Bridging Language Generation and Quantum Hardware through Simulator-Based Feedback

Computation and Language

Helps computers write code for quantum machines.

30 Oct 2025

Tracing Positional Bias in Financial Decision-Making: Mechanistic Insights from Qwen2.5

Computational Finance

Finds hidden bias in money-making computer programs.

25 Aug 2025

Repos / Data Links

github.com github.com

Page Count

40 pages
