Extracting Practical, Actionable Energy Insights from Supercomputer Telemetry and Logs

Published: May 20, 2025 | arXiv ID: 2505.14796v1

By: Melanie Cornelius, Greg Cross, Shilpika Shilpika and more

Potential Business Impact:

Saves supercomputer energy by analyzing how its GPUs are used and how much power they draw.

Business Areas:
Big Data, Data and Analytics

As supercomputers grow in size and complexity, power efficiency has become a critical challenge, particularly in understanding GPU power consumption within modern HPC workloads. This work addresses this challenge by presenting a data co-analysis approach using system data collected from the Polaris supercomputer at Argonne National Laboratory. We focus on GPU utilization and power demands, navigating the complexities of large-scale, heterogeneous datasets. Our approach, which incorporates data preprocessing, post-processing, and statistical methods, condenses the data volume by 94% while preserving essential insights. Through this analysis, we uncover key opportunities for power optimization, such as reducing high idle power costs, applying power strategies at the job level, and aligning GPU power allocation with workload demands. Our findings provide actionable insights for energy-efficient computing and offer a practical, reproducible approach for applying existing research to optimize system performance.
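
To make the idea of condensing telemetry while keeping job-level power insight more concrete, here is a minimal Python sketch. It is not the authors' pipeline: the sample layout, the job IDs, the idle-utilization threshold, and all function names are assumptions chosen for illustration, and the numbers are made up rather than taken from Polaris. It assumes samples are taken at equal intervals, so summed power readings are proportional to energy.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical telemetry samples: (job_id, gpu_utilization_percent, power_watts).
# A job_id of None marks samples taken while no job was running on the GPU.
# These values are invented for illustration only.
SAMPLES = [
    ("job-a", 95.0, 380.0),
    ("job-a", 90.0, 370.0),
    ("job-a", 0.0, 60.0),
    ("job-b", 20.0, 120.0),
    ("job-b", 25.0, 130.0),
    (None, 0.0, 55.0),
    (None, 0.0, 55.0),
    (None, 0.0, 50.0),
]

# Assumed cutoff below which a GPU is treated as idle (not from the paper).
IDLE_UTILIZATION_THRESHOLD = 5.0


def summarize_by_job(samples):
    """Collapse per-sample rows into one summary row per job."""
    grouped = defaultdict(list)
    for job_id, util, power in samples:
        grouped[job_id].append((util, power))
    return {
        job_id: {
            "samples": len(rows),
            "mean_util": mean(u for u, _ in rows),
            "mean_power": mean(p for _, p in rows),
        }
        for job_id, rows in grouped.items()
    }


def idle_power_share(samples, threshold=IDLE_UTILIZATION_THRESHOLD):
    """Fraction of total summed power drawn while utilization is below threshold."""
    total = sum(p for _, _, p in samples)
    idle = sum(p for _, u, p in samples if u < threshold)
    return idle / total if total else 0.0


def volume_reduction(samples, summary):
    """Fraction of rows removed by replacing raw samples with per-job summaries."""
    return 1.0 - len(summary) / len(samples)


if __name__ == "__main__":
    summary = summarize_by_job(SAMPLES)
    for job_id, stats in summary.items():
        print(job_id, stats)
    print("idle power share:", round(idle_power_share(SAMPLES), 3))
    print("row reduction:", volume_reduction(SAMPLES, summary))
```

In this toy data the per-job summary also hints at the kind of mismatch the abstract mentions: a job with low mean utilization still holds a GPU that draws power, which is the sort of signal a job-level power strategy could act on.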

Country of Origin
🇺🇸 United States

Page Count
12 pages

Category
Computer Science:
Distributed, Parallel, and Cluster Computing