Score: 2

Resource-Efficient Adaptation of Large Language Models for Text Embeddings via Prompt Engineering and Contrastive Fine-tuning

Published: July 30, 2025 | arXiv ID: 2507.22729v2

By: Benedikt Roth, Stephan Rappensperger, Tianming Qiu, and others

Potential Business Impact:

Improves sentence- and document-level text embeddings from LLMs, enabling more accurate clustering, classification, and retrieval without the cost of training a dedicated embedding model from scratch.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large Language Models (LLMs) have become a cornerstone in Natural Language Processing (NLP), achieving impressive performance in text generation. Their token-level representations capture rich, human-aligned semantics. However, pooling these vectors into a text embedding discards crucial information. Nevertheless, many non-generative downstream tasks, such as clustering, classification, or retrieval, still depend on accurate and controllable sentence- or document-level embeddings. We explore several adaptation strategies for pre-trained, decoder-only LLMs: (i) various aggregation techniques for token embeddings, (ii) task-specific prompt engineering, and (iii) text-level augmentation via contrastive fine-tuning. Combining these components yields competitive performance on the English clustering track of the Massive Text Embedding Benchmark (MTEB). An analysis of the attention map further shows that fine-tuning shifts focus from prompt tokens to semantically relevant words, indicating more effective compression of meaning into the final hidden state. Our experiments demonstrate that LLMs can be effectively adapted as text embedding models through a combination of prompt engineering and resource-efficient contrastive fine-tuning on synthetically generated positive pairs.
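
As a rough illustration of the adaptation recipe summarized above (aggregating token embeddings into a single text embedding, then contrastive fine-tuning on positive pairs), the PyTorch sketch below combines last-token pooling with an in-batch InfoNCE-style loss. The tensor shapes, temperature value, and function names are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch, assuming last-token pooling and an in-batch InfoNCE loss;
# this is not the authors' implementation.
import torch
import torch.nn.functional as F

def last_token_pool(hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    """Pool a decoder-only LLM's token representations by taking the
    hidden state of the last non-padding token in each sequence."""
    last_idx = attention_mask.sum(dim=1) - 1                    # (batch,)
    return hidden_states[torch.arange(hidden_states.size(0)), last_idx]

def info_nce_loss(anchor: torch.Tensor,
                  positive: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive loss: each anchor's positive is the matching row;
    all other rows in the batch serve as negatives."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.T / temperature                  # (batch, batch)
    targets = torch.arange(anchor.size(0))
    return F.cross_entropy(logits, targets)

# Toy tensors standing in for the final hidden states of a decoder-only LLM
# applied to original texts and their (e.g. synthetically augmented) positives.
batch, seq_len, dim = 4, 16, 768
hidden_orig = torch.randn(batch, seq_len, dim)
hidden_pos = torch.randn(batch, seq_len, dim)
mask = torch.ones(batch, seq_len, dtype=torch.long)

emb_orig = last_token_pool(hidden_orig, mask)
emb_pos = last_token_pool(hidden_pos, mask)
loss = info_nce_loss(emb_orig, emb_pos)
print(f"contrastive loss: {loss.item():.4f}")
```

In practice the hidden states would come from the LLM applied to prompt-wrapped inputs (the paper's task-specific prompt engineering), and the loss would be backpropagated through a parameter-efficient fine-tuning setup; both are omitted here to keep the sketch self-contained.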

Repos / Data Links

Page Count
15 pages

Category
Computer Science:
Computation and Language