Knowledge is Not Enough: Injecting RL Skills for Continual Adaptation
By: Pingzhi Tang, Yiding Wang, Muhan Zhang
Potential Business Impact:
Teaches AI new facts and how to use them.
Large Language Models (LLMs) face the "knowledge cutoff" challenge: their frozen parametric memory prevents direct internalization of new information. While Supervised Fine-Tuning (SFT) is commonly used to update model knowledge, it often updates factual content without reliably improving the model's ability to use the newly incorporated information for question answering or decision-making. Reinforcement Learning (RL) is essential for acquiring such reasoning skills, but its high computational cost makes it impractical for online adaptation. We empirically observe that the parameter updates induced by SFT and RL are nearly orthogonal. Based on this observation, we propose Parametric Skill Transfer (PaST), a framework that supports modular skill transfer for efficient and effective knowledge adaptation. PaST extracts a domain-agnostic Skill Vector from a source domain and linearly injects knowledge-manipulation skills into a target model after it has undergone lightweight SFT on new data. Experiments on knowledge-incorporation QA (SQuAD, LooGLE) and agentic tool-use benchmarks (ToolBench) demonstrate the effectiveness of our method. On SQuAD, PaST outperforms the state-of-the-art self-editing SFT baseline by up to 9.9 points. It further scales to long-context QA on LooGLE with an 8.0-point absolute accuracy gain, and improves zero-shot ToolBench success rates by 10.3 points on average, with consistent gains across tool categories, indicating strong scalability and cross-domain transferability of the Skill Vector.
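The two mechanics the abstract describes, checking that SFT and RL parameter updates are nearly orthogonal and linearly adding a Skill Vector into an SFT'd model, reduce to simple parameter arithmetic. The PyTorch sketch below illustrates both on toy modules. It is a minimal sketch, not the paper's implementation: the exact Skill Vector definition (here, the source-domain RL checkpoint minus its SFT starting point), the scaling coefficient `alpha`, and the helper names `flat_delta`, `inject_skill_vector`, and `perturbed_copy` are all illustrative assumptions.

```python
import torch
import torch.nn as nn


def flat_delta(after: nn.Module, before: nn.Module) -> torch.Tensor:
    """Concatenate per-parameter updates (after - before) into one flat vector."""
    return torch.cat([
        (pa.detach() - pb.detach()).flatten()
        for pa, pb in zip(after.parameters(), before.parameters())
    ])


def inject_skill_vector(target: nn.Module, rl_src: nn.Module,
                        sft_src: nn.Module, alpha: float = 1.0) -> None:
    """Add alpha * (RL weights - SFT weights) from the source domain into a
    target model already SFT'd on new data (assumed recipe, not the paper's)."""
    with torch.no_grad():
        for p_t, p_rl, p_sft in zip(target.parameters(),
                                    rl_src.parameters(),
                                    sft_src.parameters()):
            p_t.add_(alpha * (p_rl - p_sft))


def perturbed_copy(model: nn.Module, scale: float = 0.01) -> nn.Module:
    """Clone `model` and add small random updates, standing in for fine-tuning."""
    clone = nn.Linear(64, 64)
    clone.load_state_dict(model.state_dict())
    with torch.no_grad():
        for p in clone.parameters():
            p.add_(scale * torch.randn_like(p))
    return clone


torch.manual_seed(0)
base = nn.Linear(64, 64)          # shared pre-trained checkpoint
sft_src = perturbed_copy(base)    # source-domain SFT checkpoint
rl_src = perturbed_copy(sft_src)  # source-domain RL checkpoint (RL on top of SFT)
sft_tgt = perturbed_copy(base)    # target model after lightweight SFT on new data

# Independent high-dimensional updates are nearly orthogonal (cosine ~ 0),
# mirroring the paper's empirical observation for SFT vs. RL updates.
cos = torch.nn.functional.cosine_similarity(
    flat_delta(sft_src, base), flat_delta(rl_src, sft_src), dim=0)
print(f"cos(SFT update, RL update) = {cos.item():.3f}")

# Linearly inject the Skill Vector (rl_src - sft_src) into the SFT'd target.
inject_skill_vector(sft_tgt, rl_src, sft_src, alpha=1.0)
```

In this reading, near-orthogonality is what makes the linear injection plausible: adding the RL-direction update should leave the SFT-acquired knowledge largely intact. The coefficient `alpha` is shown as a tunable injection strength; whether and how PaST scales the vector is not specified in the abstract.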
Similar Papers
Mitigating Forgetting Between Supervised and Reinforcement Learning Yields Stronger Reasoners
Computation and Language
Makes AI smarter by learning from mistakes.
Easy Adaptation: An Efficient Task-Specific Knowledge Injection Method for Large Models in Resource-Constrained Environments
Machine Learning (CS)
Makes big AI models work better with less effort.
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
Machine Learning (CS)
Keeps AI smart while teaching it new things.