Improving Slow Transfer Predictions: Generative Methods Compared
By: Jacob Taegon Kim, Alex Sim, Kesheng Wu, and more
Monitoring data transfer performance is a crucial task in scientific computing networks. By predicting performance early in the communication phase, potentially slow transfers can be identified and selectively monitored, improving network usage and overall performance. A key bottleneck to improving the predictive power of machine learning (ML) models in this setting is class imbalance: slow transfers are far rarer than normal ones. This work addresses the class imbalance problem to improve the accuracy of performance predictions. We analyze and compare several augmentation strategies, including traditional oversampling methods and generative techniques, and we vary the class imbalance ratio in the training data to evaluate its impact on model performance. While augmentation can improve performance, the gains do not hold up as the imbalance ratio increases. We conclude that even an advanced generative technique such as CTGAN does not significantly outperform simple stratified sampling.
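The abstract contrasts traditional oversampling and generative augmentation (CTGAN) against a stratified-sampling baseline. The sketch below shows how such a comparison might be set up; it is not the authors' pipeline. The synthetic transfer features, the 19:1 imbalance ratio, the RandomForest classifier, and the use of scikit-learn and imbalanced-learn are all illustrative assumptions.

# Minimal sketch (assumed setup): compare a stratified-sampling baseline
# against oversampling-based augmentation for an imbalanced
# "slow vs. normal transfer" classifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import RandomOverSampler, SMOTE

# Synthetic stand-in for transfer features (throughput, RTT, size, ...);
# weights=[0.95, 0.05] gives roughly a 19:1 normal-to-slow imbalance.
X, y = make_classification(n_samples=20_000, n_features=10, n_informative=6,
                           weights=[0.95, 0.05], random_state=0)

# A stratified split keeps the minority ("slow") class proportion in both sets.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

strategies = {
    "stratified baseline": None,                       # no augmentation
    "random oversampling": RandomOverSampler(random_state=0),
    "SMOTE": SMOTE(random_state=0),
    # A generative augmenter such as CTGAN would slot in here: fit it on the
    # minority-class rows of the training set and append sampled rows.
}

for name, sampler in strategies.items():
    # Augment only the training data; the test set keeps its natural imbalance.
    X_aug, y_aug = (X_tr, y_tr) if sampler is None else sampler.fit_resample(X_tr, y_tr)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_aug, y_aug)
    print(f"{name:>20s}  minority-class F1 = {f1_score(y_te, clf.predict(X_te)):.3f}")

In an evaluation along the lines of the abstract, the training-set imbalance ratio would be varied and minority-class metrics (e.g., F1 or recall for slow transfers) compared across strategies.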
Similar Papers
From Data to Decision: A Multi-Stage Framework for Class Imbalance Mitigation in Optical Network Failure Analysis
Machine Learning (CS)
Spots failures in optical networks more reliably by mitigating class imbalance.
Cyber Security Data Science: Machine Learning Methods and their Performance on Imbalanced Datasets
Machine Learning (CS)
Compares machine learning methods for detecting cyber threats on imbalanced datasets.
Regression Augmentation With Data-Driven Segmentation
Machine Learning (CS)
Helps regression models predict rare cases by augmenting training data with data-driven segmentation.