CarbonScaling: Extending Neural Scaling Laws for Carbon Footprint in Large Language Models
By: Lei Jiang, Fan Chen
Potential Business Impact:
Helps reduce the carbon footprint of AI training.
Neural scaling laws have driven the development of increasingly large language models (LLMs) by linking accuracy improvements to growth in parameter count, dataset size, and compute. However, these laws overlook the carbon emissions that scale exponentially with LLM size. This paper presents CarbonScaling, an analytical framework that extends neural scaling laws to incorporate both operational and embodied carbon in LLM training. By integrating models for neural scaling, GPU hardware evolution, parallelism optimization, and carbon estimation, CarbonScaling quantitatively connects model accuracy to carbon footprint. Results show that while a power-law relationship between accuracy and carbon holds, real-world inefficiencies significantly increase the scaling factor. Hardware technology scaling reduces carbon emissions for small to mid-sized models, but offers diminishing returns for extremely large LLMs due to communication overhead and underutilized GPUs. Training optimizations, especially aggressive critical batch size scaling, help alleviate this inefficiency. CarbonScaling offers key insights for training more sustainable and carbon-efficient LLMs.
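The abstract describes a chain from accuracy to carbon: a neural scaling law sets the loss achievable for a given model and dataset size, a hardware and utilization model converts the resulting compute into GPU-hours, and a carbon model adds operational and embodied emissions. The paper's actual formulation is not reproduced here, so the sketch below is only illustrative: it pairs a Chinchilla-style loss law with a simple operational-plus-embodied carbon estimate. All constants, the gpu_utilization factor, and the helper names are assumptions, not CarbonScaling itself.

```python
# Illustrative sketch (not the paper's CarbonScaling model): compose a
# Chinchilla-style neural scaling law with a rough operational + embodied
# carbon estimate. All numeric constants are assumed placeholder values.

def loss(n_params: float, n_tokens: float) -> float:
    """Chinchilla-style loss law L = E + A/N^alpha + B/D^beta (assumed fit constants)."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

def training_carbon_kg(n_params: float, n_tokens: float,
                       gpu_peak_flops: float = 1e15,      # assumed per-GPU peak FLOP/s
                       gpu_utilization: float = 0.4,      # real-world inefficiency factor
                       gpu_power_kw: float = 0.7,         # assumed per-GPU power draw (kW)
                       grid_kgco2_per_kwh: float = 0.4,   # assumed grid carbon intensity
                       embodied_kgco2_per_gpu_hr: float = 0.02) -> float:
    """Operational + embodied carbon (kg CO2e) for one training run, rough estimate."""
    total_flops = 6.0 * n_params * n_tokens                       # standard 6*N*D training-FLOP rule
    gpu_hours = total_flops / (gpu_peak_flops * gpu_utilization) / 3600.0
    operational = gpu_hours * gpu_power_kw * grid_kgco2_per_kwh   # electricity emissions
    embodied = gpu_hours * embodied_kgco2_per_gpu_hr              # amortized hardware emissions
    return operational + embodied

if __name__ == "__main__":
    # Sweep model sizes (with a Chinchilla-style ~20 tokens per parameter) to
    # trace the accuracy-vs-carbon trade-off the abstract describes as a power law.
    for n in (1e9, 1e10, 1e11):
        d = 20.0 * n
        print(f"N={n:.0e}  loss={loss(n, d):.3f}  carbon={training_carbon_kg(n, d):,.0f} kg CO2e")
```

Lowering gpu_utilization in this sketch inflates GPU-hours and hence carbon without improving loss, which mirrors the abstract's point that communication overhead and underutilized GPUs raise the effective scaling factor for very large models.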
Similar Papers
Quantifying the Energy Consumption and Carbon Emissions of LLM Inference via Simulations
Distributed, Parallel, and Cluster Computing
Makes AI use less electricity and produce less pollution.
Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights
Machine Learning (CS)
Cuts AI's energy use by almost half.
Large Language Model Scaling Laws for Neural Quantum States in Quantum Chemistry
Machine Learning (CS)
Makes AI models for quantum chemistry learn faster and better.