Characterizing and Understanding Energy Footprint and Efficiency of Small Language Model on Edges
By: Md Romyull Islam, Bobin Deng, Nobel Dhar, and more
Potential Business Impact:
Makes smart gadgets run AI without internet.
Cloud-based large language models (LLMs) and their variants have significantly influenced real-world applications. Deploying smaller models, i.e., small language models (SLMs), on edge devices offers additional advantages such as reduced latency and independence from network connectivity. However, the limited computing resources and constrained energy budgets of edge devices make efficient deployment challenging. This study evaluates the power efficiency of five representative SLMs, including Llama 3.2, Phi-3 Mini, TinyLlama, and Gemma 2, on the Raspberry Pi 5, Jetson Nano, and Jetson Orin Nano (in both CPU and GPU configurations). Results show that the Jetson Orin Nano with GPU acceleration achieves the highest energy-to-performance ratio, significantly outperforming CPU-based setups. Llama 3.2 provides the best balance of accuracy and power efficiency, while TinyLlama is well suited for low-power environments at the cost of reduced accuracy. In contrast, Phi-3 Mini consumes the most energy despite its high accuracy. GPU acceleration, memory bandwidth, and model architecture emerge as the key factors in optimizing inference energy efficiency. Our empirical analysis offers practical insights for AI, smart systems, and mobile ad-hoc platforms that must trade off accuracy, inference latency, and power efficiency in energy-constrained environments.
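The energy-to-performance comparisons in the abstract hinge on measuring joules consumed per generated token. The paper does not publish its measurement harness, so the following is only a minimal sketch of how such a measurement might be taken on a Jetson-class board: a background thread polls an onboard power sensor through sysfs and integrates the power trace over one inference run. The sysfs path, the sampling interval, and the `generate_fn` callable are illustrative assumptions, not the authors' actual setup.

```python
import threading
import time

# Hypothetical sysfs node for a board power rail (reported in microwatts).
# The real path varies by Jetson/Raspberry Pi revision and kernel version.
POWER_SENSOR = "/sys/bus/i2c/drivers/ina3221/1-0040/hwmon/hwmon1/power1_input"


class PowerSampler:
    """Polls a sysfs power sensor and integrates power over time (joules)."""

    def __init__(self, path=POWER_SENSOR, interval_s=0.05):
        self.path = path
        self.interval_s = interval_s
        self.samples = []  # list of (timestamp_s, watts)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _read_watts(self):
        with open(self.path) as f:
            return int(f.read()) / 1e6  # microwatts -> watts

    def _run(self):
        while not self._stop.is_set():
            self.samples.append((time.monotonic(), self._read_watts()))
            time.sleep(self.interval_s)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

    def energy_joules(self):
        # Trapezoidal integration of the sampled power trace.
        energy = 0.0
        for (t0, p0), (t1, p1) in zip(self.samples, self.samples[1:]):
            energy += 0.5 * (p0 + p1) * (t1 - t0)
        return energy


def joules_per_token(generate_fn, prompt):
    """Runs one inference under the sampler and reports energy per token.

    generate_fn is assumed to return the list of generated tokens; wrap
    whatever SLM runtime is in use (llama.cpp bindings, ONNX Runtime, etc.).
    """
    with PowerSampler() as sampler:
        tokens = generate_fn(prompt)
    return sampler.energy_joules() / max(len(tokens), 1)
```

Sampling at a fixed interval and integrating with the trapezoid rule keeps the harness runtime-agnostic; in practice one would also subtract the board's measured idle power to isolate the energy attributable to inference.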
Similar Papers
Edge Deployment of Small Language Models, a comprehensive comparison of CPU, GPU and NPU backends
Performance
Makes AI run faster on small, cheap devices.
Camel: Energy-Aware LLM Inference on Resource-Constrained Devices
Networking and Internet Architecture
Makes smart computer programs run faster, using less power.
Understanding the Performance and Power of LLM Inferencing on Edge Accelerators
Distributed, Parallel, and Cluster Computing
Runs smart AI on small computers, not just big ones.