Federated Learning with Ad-hoc Adapter Insertions: The Case of Soft-Embeddings for Training Classifier-as-Retriever
By: Marijan Fofonjka, Shahryar Zehtabi, Alireza Behtash, and more
Potential Business Impact:
Makes AI learn new things on small devices.
When existing retrieval-augmented generation (RAG) solutions are applied to new knowledge domains, their encoders, which are typically pretrained large language models (LLMs), must be updated. However, fully fine-tuning these large models is compute- and memory-intensive, and even infeasible when deployed on resource-constrained edge devices. In this work, we propose a novel encoder architecture that addresses this limitation by using a frozen small language model (SLM), which satisfies the memory constraints of edge devices, and inserting a small adapter network before the transformer blocks of the SLM. The trainable adapter takes the token embeddings of the new corpus and learns to produce enhanced soft embeddings for it, while requiring significantly less compute to update than full fine-tuning. We further propose a novel retrieval mechanism by attaching a classifier head to the SLM encoder, which is trained to learn a similarity mapping from input embeddings to their corresponding documents. Finally, to enable the online fine-tuning of both (i) the encoder soft embeddings and (ii) the classifier-as-retriever on edge devices, we adopt federated learning (FL) and differential privacy (DP) to achieve an efficient, privacy-preserving, and product-grade training solution. We conduct a theoretical analysis of our methodology, establishing convergence guarantees for general smooth nonconvex loss functions under mild assumptions on gradient variance. Through extensive numerical experiments, we demonstrate (i) the efficacy of soft embeddings in enhancing the encoder, (ii) the benefit of training a classifier as the retriever, and (iii) the role of FL in achieving training speedup.
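To make the described encoder concrete, below is a minimal sketch (not the authors' code) of the idea: a frozen SLM whose token embeddings first pass through a small trainable adapter that produces soft embeddings, with a classifier head on top acting as the retriever. It assumes a Hugging Face-style model interface (get_input_embeddings, inputs_embeds, last_hidden_state); the class names, the bottleneck adapter design, and the mean pooling are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: frozen SLM + trainable adapter before the transformer blocks
# + trainable classifier head over corpus documents. Names and sizes are
# illustrative assumptions; only the adapter and head receive gradients.
import torch
import torch.nn as nn


class SoftEmbeddingAdapter(nn.Module):
    """Small bottleneck MLP that refines token embeddings into soft embeddings."""

    def __init__(self, hidden_dim: int, adapter_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, adapter_dim)
        self.up = nn.Linear(adapter_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen model's embeddings as a baseline.
        return token_embeds + self.up(self.act(self.down(token_embeds)))


class ClassifierAsRetriever(nn.Module):
    """Frozen SLM encoder + trainable adapter + classifier head used as retriever."""

    def __init__(self, slm, hidden_dim: int, num_docs: int):
        super().__init__()
        self.slm = slm  # pretrained small language model, kept frozen
        for p in self.slm.parameters():
            p.requires_grad = False
        self.adapter = SoftEmbeddingAdapter(hidden_dim)
        self.retriever_head = nn.Linear(hidden_dim, num_docs)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor):
        # 1) Look up token embeddings from the frozen model.
        embeds = self.slm.get_input_embeddings()(input_ids)
        # 2) Adapter produces enhanced soft embeddings before the transformer blocks.
        soft_embeds = self.adapter(embeds)
        # 3) Run the frozen transformer on the soft embeddings.
        out = self.slm(inputs_embeds=soft_embeds, attention_mask=attention_mask)
        pooled = out.last_hidden_state.mean(dim=1)  # simple mean pooling (assumption)
        # 4) Head scores every corpus document; the top score is the retrieved document.
        return self.retriever_head(pooled)
```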
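The abstract also describes fine-tuning the adapter and classifier head online via federated learning with differential privacy. The sketch below shows one possible reading: a DP-SGD-style local client step (gradient clipping plus Gaussian noise, applied only to the trainable parameters) and FedAvg-style server aggregation. The per-batch clipping, function names, and hyperparameters are simplifying assumptions rather than the paper's exact protocol.

```python
# Hedged sketch of federated, differentially private updates for the trainable
# parts (adapter + classifier head). local_dp_step and fedavg are hypothetical
# helper names; a faithful DP-SGD would clip per-example gradients.
import torch


def local_dp_step(model, batch, loss_fn, lr=1e-3, clip_norm=1.0, noise_mult=1.0):
    """One client-side step: clip the batch gradient and add Gaussian noise."""
    trainable = [p for p in model.parameters() if p.requires_grad]
    logits = model(batch["input_ids"], batch["attention_mask"])
    loss = loss_fn(logits, batch["doc_labels"])
    grads = torch.autograd.grad(loss, trainable)
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)
    with torch.no_grad():
        for p, g in zip(trainable, grads):
            noisy = g * scale + noise_mult * clip_norm * torch.randn_like(g)
            p -= lr * noisy  # SGD update on adapter/head parameters only


def fedavg(client_states):
    """Server-side FedAvg over the clients' trainable parameter dictionaries."""
    avg = {}
    for key in client_states[0]:
        avg[key] = torch.stack([s[key] for s in client_states]).mean(dim=0)
    return avg
```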
Similar Papers
Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation
Computation and Language
Keeps private info safe while AI learns.
Adaptation of Embedding Models to Financial Filings via LLM Distillation
Computation and Language
Teaches AI to find specific money information faster.
Efficient Split Federated Learning for Large Language Models over Communication Networks
Machine Learning (CS)
Makes smart computer programs train faster on phones.