
Sensitivity of Stability: Theoretical & Empirical Analysis of Replicability for Adaptive Data Selection in Transfer Learning

Published: August 6, 2025 | arXiv ID: 2508.04901v2

By: Prabhav Singh, Jessica Sorrell

BigTech Affiliations: Johns Hopkins University

Potential Business Impact:

Makes smart computer learning more trustworthy.

The widespread adoption of transfer learning has revolutionized machine learning by enabling efficient adaptation of pre-trained models to new domains. However, the reliability of these adaptations remains poorly understood, particularly when using adaptive data selection strategies that dynamically prioritize training examples. We present a comprehensive theoretical and empirical analysis of replicability in transfer learning, introducing a mathematical framework that quantifies the fundamental trade-off between adaptation effectiveness and result consistency. Our key contribution is the formalization of selection sensitivity ($\Delta_Q$), a measure that captures how adaptive selection strategies respond to perturbations in training data. We prove that the replicability failure probability (the likelihood that two independent training runs produce models differing in performance by more than a threshold) increases quadratically with selection sensitivity while decreasing exponentially with sample size. Through extensive experiments on the MultiNLI corpus using six adaptive selection strategies, ranging from uniform sampling to gradient-based selection, we demonstrate that this theoretical relationship holds precisely in practice. Our results reveal that highly adaptive strategies such as gradient-based selection and curriculum learning achieve superior task performance but suffer from high replicability failure rates, while less adaptive approaches maintain failure rates below 7%. Crucially, we show that source domain pretraining provides a powerful mitigation mechanism, reducing failure rates by up to 30% while preserving performance gains. These findings establish principled guidelines for practitioners to navigate the performance-replicability trade-off and highlight the need for replicability-aware design in modern transfer learning systems.
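The abstract defines replicability failure as two independent training runs differing in performance by more than a threshold. A minimal sketch of how such a failure rate might be estimated empirically from repeated runs follows. The function name `replicability_failure_rate`, the pairwise estimator, and the accuracy values are illustrative assumptions, not the authors' code or data.

```python
from itertools import combinations


def replicability_failure_rate(scores, threshold):
    """Fraction of run pairs whose scores differ by more than `threshold`.

    `scores` holds one performance number (e.g. accuracy) per training run.
    Averaging over all pairs is an assumption made for this sketch; the
    paper may estimate the failure probability differently.
    """
    pairs = list(combinations(scores, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    failures = sum(1 for a, b in pairs if abs(a - b) > threshold)
    return failures / len(pairs)


# Hypothetical accuracies from five runs of one selection strategy.
run_accuracies = [0.80, 0.81, 0.79, 0.86, 0.80]
rate = replicability_failure_rate(run_accuracies, threshold=0.03)
print(f"empirical failure rate: {rate:.2f}")
```

Computing this rate per selection strategy, with the same threshold, would give the kind of comparison the abstract describes between highly adaptive and less adaptive approaches.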

Learning under Distributional Drift: Reproducibility as an Intrinsic Statistical Resource

Machine Learning (CS)

Limits how fast learning systems can adapt.

15 Dec 2025 0

86%

Approximate Replicability in Learning

Machine Learning (CS)

Makes computer learning work even with messy data.

23 Oct 2025 0

85%

Uncertainty-Driven Reliability: Selective Prediction and Trustworthy Deployment in Modern Machine Learning

Machine Learning (CS)

Helps computers know when they are wrong.

11 Aug 2025 2


Country of Origin

🇺🇸 United States

Page Count

24 pages
