Magneton: Optimizing Energy Efficiency of ML Systems via Differential Energy Debugging
By: Yi Pan, Wenbo Qian, Dedong Xie, and more
Potential Business Impact:
Finds wasted energy in AI programs.
The training and deployment of machine learning (ML) models have become extremely energy-intensive. While existing optimization efforts focus primarily on hardware energy efficiency, a significant but overlooked source of inefficiency is software energy waste caused by poor software design, such as redundant or suboptimal operations that consume extra energy without improving performance. These inefficiencies arise in widely used ML frameworks and applications, yet developers often lack the visibility and tools to detect and diagnose them. We propose differential energy debugging, a novel approach that leverages the observation that competing ML systems often implement similar functionality with vastly different energy consumption. Building on this insight, we design and implement Magneton, an energy profiler that compares energy consumption between similar ML systems at the operator level and automatically pinpoints the code regions and configuration choices responsible for excessive energy use. Applied to 9 popular ML systems spanning LLM inference, general ML frameworks, and image generation, Magneton detects and diagnoses 16 known cases of software energy inefficiency and discovers 8 previously unknown cases, 7 of which have been confirmed by developers.
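At its core, differential energy debugging reduces to a per-operator comparison between two systems performing the same workload. Below is a minimal Python sketch of that comparison step, assuming operator-level energy profiles have already been collected; the `diff_energy_profiles` function, the 1.5x threshold, and the sample joule figures are illustrative assumptions, not Magneton's actual interface or measurements.

```python
from typing import Dict, List, Tuple

def diff_energy_profiles(
    profile_a: Dict[str, float],   # operator name -> energy (joules) in system A
    profile_b: Dict[str, float],   # operator name -> energy (joules) in system B
    threshold: float = 1.5,        # flag operators whose energy gap exceeds 1.5x
) -> List[Tuple[str, float, str]]:
    """Rank operators shared by both systems by their energy gap.

    Returns (operator, ratio, wasteful_system) tuples, largest gap first.
    """
    suspects = []
    for op in profile_a.keys() & profile_b.keys():
        a, b = profile_a[op], profile_b[op]
        ratio = max(a, b) / max(min(a, b), 1e-9)  # guard against zero readings
        if ratio >= threshold:
            suspects.append((op, ratio, "A" if a > b else "B"))
    return sorted(suspects, key=lambda t: t[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical per-operator energy profiles from two LLM inference
    # engines running the same prompt set.
    sys_a = {"attention": 12.4, "mlp": 9.1, "kv_cache_copy": 6.8, "sampling": 0.4}
    sys_b = {"attention": 11.9, "mlp": 8.7, "kv_cache_copy": 1.2, "sampling": 0.5}
    for op, ratio, waster in diff_energy_profiles(sys_a, sys_b):
        print(f"{op}: system {waster} uses {ratio:.1f}x more energy")
```

In this toy run, only `kv_cache_copy` exceeds the threshold, flagging it as the likely site of software energy waste in system A and narrowing the debugging effort to one code region.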
Similar Papers
Runtime Energy Monitoring for RISC-V Soft-Cores
Hardware Architecture
Measures computer energy use without complex math.
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Machine Learning (CS)
Measures AI energy use, helps save power.
Compression-Induced Communication-Efficient Large Model Training and Inferencing
Machine Learning (CS)
Saves energy training smart computer programs.