Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches

Published: December 14, 2025 | arXiv ID: 2512.12677v1

By: Amirhossein Yousefiramandi, Ciaran Cooney

Potential Business Impact:

Makes big AI models learn new jobs with less power.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

We explore efficient strategies to fine-tune decoder-only Large Language Models (LLMs) for downstream text classification under resource constraints. Two approaches are investigated: (1) attaching a classification head to a pre-trained causal LLM and fine-tuning on the task (using the LLM's final token embedding as a sequence representation), and (2) instruction-tuning the LLM in a prompt->response format for classification. To enable single-GPU fine-tuning of models up to 8B parameters, we combine 4-bit model quantization with Low-Rank Adaptation (LoRA) for parameter-efficient training. Experiments on two datasets - a proprietary single-label dataset and the public WIPO-Alpha patent dataset (extreme multi-label classification) - show that the embedding-based method significantly outperforms the instruction-tuned method in F1-score, and is very competitive with - even surpassing - fine-tuned domain-specific models (e.g. BERT) on the same tasks. These results demonstrate that directly leveraging the internal representations of causal LLMs, along with efficient fine-tuning techniques, yields impressive classification performance under limited computational resources. We discuss the advantages of each approach while outlining practical guidelines and future directions for optimizing LLM fine-tuning in classification scenarios.
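To make the embedding-based approach concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes right-padded inputs, toy two-dimensional hidden states, and hypothetical helper names (`last_token_embedding`, `lora_linear`, `classify`). It illustrates two ideas from the abstract: using the final token's hidden state as the sequence representation for a linear classification head, and the LoRA update, in which a frozen weight matrix W is augmented by a scaled low-rank product B·A. The 4-bit quantization step is omitted.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def last_token_embedding(hidden_states, attention_mask):
    """Return the hidden state of the last non-padding token.

    Assumes right padding: the mask is 1 for real tokens and 0 for padding.
    """
    last_index = sum(attention_mask) - 1
    return hidden_states[last_index]


def lora_linear(x, W, A, B, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W is the frozen weight matrix; only the low-rank factors A (r x d_in)
    and B (d_out x r) would be trained. r is the adapter rank.
    """
    r = len(A)
    scale = alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def classify(hidden_states, attention_mask, head):
    """Apply a linear classification head to the final-token embedding."""
    pooled = last_token_embedding(hidden_states, attention_mask)
    logits = matvec(head, pooled)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy usage: three positions, the last one is padding.
hidden = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 0]
head = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical two-class head
predicted_class = classify(hidden, mask, head)
```

With B initialized to zeros, as is standard for LoRA, `lora_linear` returns exactly the frozen layer's output, so training starts from the pre-trained model's behavior. In the instruction-based alternative there is no separate head: the label is generated as text in response to a prompt.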

Regularization Through Reasoning: Systematic Improvements in Language Model Classification via Explanation-Enhanced Fine-Tuning

Machine Learning (CS)

Makes AI better at choosing the right answer.

3 Nov 2025

89%

Adaptation of Embedding Models to Financial Filings via LLM Distillation

Computation and Language

Teaches AI to find specific money information faster.

8 Dec 2025

89%

Accuracy and Efficiency Trade-Offs in LLM-Based Malware Detection and Explanation: A Comparative Study of Parameter Tuning vs. Full Fine-Tuning

Cryptography and Security

Helps computers explain why files are bad.

24 Nov 2025

Page Count

18 pages
