Explaining Fine-Tuned LLMs via Counterfactuals: A Knowledge Graph-Driven Framework
By: Yucheng Wang, Ziyang Chen, Md Faisal Kabir
Potential Business Impact:
Explains what changes inside smart programs after they learn new skills.
The widespread adoption of Low-Rank Adaptation (LoRA) has enabled large language models (LLMs) to acquire domain-specific knowledge with remarkable efficiency. However, understanding how such a fine-tuning mechanism alters a model's structural reasoning and semantic behavior remains an open challenge. This work introduces a novel framework that explains fine-tuned LLMs via counterfactuals grounded in knowledge graphs. Specifically, we construct BioToolKG, a domain-specific heterogeneous knowledge graph of bioinformatics tools, and design a counterfactual-based explainer for fine-tuned LLMs (CFFTLLMExplainer) that learns soft masks over graph nodes and edges to generate minimal structural perturbations that induce maximal semantic divergence. Our method jointly optimizes structural sparsity and semantic divergence while enforcing interpretability-preserving constraints such as entropy regularization and edge smoothness. We apply this framework to a fine-tuned LLaMA-based LLM and show that counterfactual masking exposes the model's structural dependencies and aligns with LoRA-induced parameter shifts. This work provides new insights into the internal mechanisms of fine-tuned LLMs and highlights counterfactual graphs as a potential tool for interpretable AI.
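To make the optimization concrete, below is a minimal sketch of the kind of loss the abstract describes: soft masks over knowledge-graph nodes and edges, trained to maximize semantic divergence of the fine-tuned model's output under a minimal, near-binary structural perturbation. This is not the authors' released implementation; the mask encoding (1 = keep, 0 = perturb), the precomputed `divergence` score, the edge-smoothness formulation, and all function and hyperparameter names are illustrative assumptions.

```python
# Sketch (assumed, not the paper's code) of a counterfactual mask loss:
# maximize semantic divergence, penalize perturbation size, keep masks
# near-binary (entropy) and consistent along edges (smoothness).
import torch

def binary_entropy(m, eps=1e-8):
    """Binary entropy of a soft mask; penalizing it pushes masks toward 0/1."""
    return -(m * (m + eps).log() + (1 - m) * (1 - m + eps).log()).mean()

def cf_mask_loss(node_logits, edge_logits, src, dst, divergence,
                 lam_sparse=0.1, lam_ent=0.01, lam_smooth=0.01):
    node_mask = torch.sigmoid(node_logits)  # soft mask in (0,1); 1 = keep node
    edge_mask = torch.sigmoid(edge_logits)  # soft mask in (0,1); 1 = keep edge

    # Semantic divergence (assumed precomputed upstream, e.g. an embedding
    # distance between the LLM's original and counterfactual answers).
    # We minimize its negative, i.e. maximize divergence.
    div_term = -divergence

    # Structural sparsity: a minimal perturbation masks out few elements.
    sparsity = (1 - node_mask).sum() + (1 - edge_mask).sum()

    # Entropy regularization keeps masks near-binary, hence interpretable.
    ent = binary_entropy(node_mask) + binary_entropy(edge_mask)

    # Edge smoothness: an edge's mask should agree with its endpoint nodes'
    # masks (one plausible reading of the constraint; an assumption here).
    smooth = ((edge_mask - 0.5 * (node_mask[src] + node_mask[dst])) ** 2).mean()

    return div_term + lam_sparse * sparsity + lam_ent * ent + lam_smooth * smooth

# Toy usage: 5 nodes, 4 edges, a dummy divergence score.
node_logits = torch.zeros(5, requires_grad=True)
edge_logits = torch.zeros(4, requires_grad=True)
src = torch.tensor([0, 1, 2, 3])
dst = torch.tensor([1, 2, 3, 4])
loss = cf_mask_loss(node_logits, edge_logits, src, dst,
                    divergence=torch.tensor(0.3))
loss.backward()  # gradients flow back to the mask logits for optimization
```

Because the masks stay differentiable, the explainer can be trained with ordinary gradient descent; the entropy and sparsity terms then drive the solution toward a small, discrete set of nodes and edges that, when removed, most change the fine-tuned model's behavior.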
Similar Papers
Guiding LLMs to Generate High-Fidelity and High-Quality Counterfactual Explanations for Text Classification
Computation and Language
Makes AI explain its decisions with small changes.
LLMs Struggle to Perform Counterfactual Reasoning with Parametric Knowledge
Artificial Intelligence
Computers can't easily mix old and new facts.
Accuracy and Efficiency Trade-Offs in LLM-Based Malware Detection and Explanation: A Comparative Study of Parameter Tuning vs. Full Fine-Tuning
Cryptography and Security
Helps computers explain why files are bad.