Benchmarking Transferability: A Framework for Fair and Robust Evaluation
By: Alireza Kazemi, Helia Rezvani, Mahsa Baktashmotlagh
Potential Business Impact:
Tests how well machine-learning models work in new situations.
Transferability scores aim to quantify how well a model trained on one domain generalizes to a target domain. Despite numerous methods proposed for measuring transferability, their reliability and practical usefulness remain inconclusive, often due to differing experimental setups, datasets, and assumptions. In this paper, we introduce a comprehensive benchmarking framework designed to systematically evaluate transferability scores across diverse settings. Through extensive experiments, we observe variations in how different metrics perform under various scenarios, suggesting that current evaluation practices may not fully capture each method's strengths and limitations. Our findings underscore the value of standardized assessment protocols, paving the way for more reliable transferability measures and better-informed model selection in cross-domain applications. Additionally, our proposed metric achieves a 3.5% improvement in the head-training fine-tuning experimental setup. Our code is available in this repository: https://github.com/alizkzm/pert_robust_platform.
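As a rough illustration of what such a benchmark evaluates, the sketch below shows the common protocol of comparing a metric's ranking of a model zoo against the ranking given by actual fine-tuned target accuracy. All names, scores, and accuracies are hypothetical, and the use of weighted Kendall's tau and selection regret is a generic choice, not necessarily the paper's exact protocol.

```python
# Minimal sketch of transferability-score evaluation (illustrative data only;
# model names, scores, and accuracies are assumptions, not results from the paper).
import numpy as np
from scipy.stats import weightedtau

# Hypothetical pre-trained model zoo with (a) transferability scores computed on
# the target data without fine-tuning and (b) the accuracy each model actually
# reaches after fine-tuning on the target task.
models = ["resnet50", "densenet121", "vit_b16", "mobilenet_v3", "efficientnet_b0"]
transferability_scores = np.array([0.62, 0.71, 0.75, 0.48, 0.66])  # e.g. LogME/LEEP-style scores
finetuned_accuracies   = np.array([0.81, 0.84, 0.88, 0.74, 0.80])  # ground-truth target accuracy

# A transferability metric is typically judged by how well its ranking of the
# model zoo agrees with the ranking induced by actual fine-tuned performance.
tau, _ = weightedtau(transferability_scores, finetuned_accuracies)
print(f"weighted Kendall's tau between score and accuracy: {tau:.3f}")

# Model selection view: pick the model the metric ranks highest and report the
# accuracy gap to the true best model in the zoo (the selection "regret").
selected = int(np.argmax(transferability_scores))
best = int(np.argmax(finetuned_accuracies))
regret = finetuned_accuracies[best] - finetuned_accuracies[selected]
print(f"selected {models[selected]}, regret vs. {models[best]}: {regret:.3f}")
```

Running the same comparison across multiple target datasets and fine-tuning setups (e.g., head-only training versus full fine-tuning) is what makes such an evaluation a benchmark rather than a single-case study.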
Similar Papers
How NOT to benchmark your SITE metric: Beyond Static Leaderboards and Towards Realistic Evaluation
Machine Learning (CS)
Finds better computer brains without retraining them.
What Does Your Benchmark Really Measure? A Framework for Robust Inference of AI Capabilities
Artificial Intelligence
Makes AI tests show how smart AI really is.
Toward Realistic Adversarial Attacks in IDS: A Novel Feasibility Metric for Transferability
Cryptography and Security
Makes computer security systems easier to trick.