Score: 1

MOSAIC: Masked Objective with Selective Adaptation for In-domain Contrastive Learning

Published: October 19, 2025 | arXiv ID: 2510.16797v1

By: Vera Pavlova, Mohammed Makhlouf

Potential Business Impact:

Adapts general-purpose sentence embedding models to specialized domains, improving retrieval quality for domain-specific semantic search.

Business Areas:
Semantic Search, Internet Services

We introduce MOSAIC (Masked Objective with Selective Adaptation for In-domain Contrastive learning), a multi-stage framework for domain adaptation of sentence embedding models that incorporates joint domain-specific masked supervision. Our approach addresses the challenges of adapting large-scale general-domain sentence embedding models to specialized domains. By jointly optimizing masked language modeling (MLM) and contrastive objectives within a unified training pipeline, our method enables effective learning of domain-relevant representations while preserving the robust semantic discrimination properties of the original model. We empirically validate our approach on both high-resource and low-resource domains, achieving improvements of up to 13.4% in NDCG@10 (Normalized Discounted Cumulative Gain) over strong general-domain baselines. Comprehensive ablation studies further demonstrate the effectiveness of each component, highlighting the importance of balanced joint supervision and staged adaptation.
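To make the joint objective concrete, here is a minimal sketch of what one training step combining a contrastive (InfoNCE) loss over sentence pairs with a domain-specific MLM loss could look like. This is not the paper's implementation: the base model (`bert-base-uncased`), mean pooling, in-batch negatives, the loss weight `lambda_mlm`, and the temperature `tau` are all illustrative assumptions.

```python
# Sketch of joint MLM + contrastive supervision in one training step.
# Assumes a Hugging Face-style masked-language model; all hyperparameters
# below are illustrative, not values from the paper.
import torch
import torch.nn.functional as F
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def mean_pool(hidden, mask):
    # Average token embeddings, ignoring padding positions.
    mask = mask.unsqueeze(-1).float()
    return (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

def joint_loss(anchor_batch, positive_batch, mlm_batch,
               lambda_mlm=0.5, tau=0.05):
    # Contrastive (InfoNCE) term over (anchor, positive) pairs;
    # the other pairs in the batch serve as in-batch negatives.
    a_out = model(**anchor_batch, output_hidden_states=True)
    p_out = model(**positive_batch, output_hidden_states=True)
    a = mean_pool(a_out.hidden_states[-1], anchor_batch["attention_mask"])
    p = mean_pool(p_out.hidden_states[-1], positive_batch["attention_mask"])
    sims = F.cosine_similarity(a.unsqueeze(1), p.unsqueeze(0), dim=-1) / tau
    targets = torch.arange(sims.size(0), device=sims.device)
    contrastive = F.cross_entropy(sims, targets)

    # Domain-specific MLM term on masked in-domain text; the collator
    # sets `labels` to the original ids at masked positions, -100 elsewhere.
    mlm = model(**mlm_batch).loss

    return contrastive + lambda_mlm * mlm

# Toy usage: for brevity, the MLM batch reuses the anchor texts
# with random masking via the standard data collator.
anchors = tokenizer(["domain query a", "domain query b"],
                    padding=True, return_tensors="pt")
positives = tokenizer(["matching doc a", "matching doc b"],
                      padding=True, return_tensors="pt")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
mlm_batch = collator([{"input_ids": ids}
                      for ids in anchors["input_ids"].tolist()])
loss = joint_loss(anchors, positives, mlm_batch)
loss.backward()
```

Balancing the two terms (here via `lambda_mlm`) is the kind of trade-off the abstract's "balanced joint supervision" refers to: too much MLM weight erodes the model's semantic discrimination, too little prevents domain adaptation.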

Repos / Data Links

Page Count
14 pages

Category
Computer Science:
Computation and Language