Score: 1

Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion

Published: April 11, 2025 | arXiv ID: 2504.08451v1

By: Weiye Chen, Qingen Zhu, Qian Long

Potential Business Impact:

Makes high-quality AI image generation fast enough to run on small, low-power devices.

Business Areas:
Autonomous Vehicles, Transportation

Recent advances in visual synthesis have leveraged diffusion models and attention mechanisms to achieve high-fidelity artistic style transfer and photorealistic text-to-image generation. However, real-time deployment on edge devices remains challenging due to computational and memory constraints. We propose Muon-AD, a co-designed framework that integrates the Muon optimizer with attention distillation for real-time edge synthesis. By eliminating gradient conflicts through orthogonal parameter updates and dynamic pruning, Muon-AD converges 3.2× faster than Stable Diffusion-TensorRT while maintaining synthesis quality (15% lower FID, 4% higher SSIM). Our framework reduces peak memory to 7 GB on a Jetson Orin and enables 24 FPS real-time generation through mixed-precision quantization and curriculum learning. Extensive experiments on COCO-Stuff and ImageNet-Texture demonstrate Muon-AD's Pareto-optimal efficiency-quality trade-offs. We further show a 65% reduction in communication overhead during distributed training and 10 s/image generation on edge GPUs. These advances pave the way for democratizing high-quality visual synthesis in resource-constrained environments.
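The abstract rests on two mechanisms: Muon's orthogonalized parameter updates and attention-map distillation. As a rough illustration of both, the sketch below shows a Muon-style step (Newton-Schulz orthogonalization of the momentum buffer, using the coefficients from the publicly released Muon optimizer) next to a generic attention-distillation loss. Function names, hyperparameters, and the loss form are assumptions made for illustration, not the authors' Muon-AD implementation.

```python
# Illustrative sketch only: a Muon-style orthogonalized update plus a generic
# attention-distillation loss. Not the paper's Muon-AD code.
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately orthogonalize a 2-D update matrix with the quintic
    Newton-Schulz iteration used by the public Muon optimizer."""
    a, b, c = 3.4445, -4.7750, 2.0315      # Muon's published iteration coefficients
    X = G / (G.norm() + 1e-7)              # normalize so the iteration converges
    transposed = X.size(0) > X.size(1)     # iterate on the wide orientation
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        B = b * A + c * (A @ A)
        X = a * X + B @ X
    return X.T if transposed else X

def muon_step(weight: torch.Tensor, momentum: torch.Tensor,
              grad: torch.Tensor, lr: float = 0.02, beta: float = 0.95) -> None:
    """One Muon-style step: accumulate momentum, orthogonalize it, apply it.
    The orthogonalization keeps each layer's update well-conditioned, which is
    plausibly what the abstract means by eliminating gradient conflicts."""
    momentum.mul_(beta).add_(grad)
    weight.add_(newton_schulz_orthogonalize(momentum), alpha=-lr)

def attention_distillation_loss(student_attn: torch.Tensor,
                                teacher_attn: torch.Tensor) -> torch.Tensor:
    """MSE between student and teacher attention maps, a common distillation
    objective; the paper's exact loss may differ."""
    return torch.nn.functional.mse_loss(student_attn, teacher_attn.detach())
```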

Country of Origin
🇨🇳 China

Repos / Data Links

Page Count
11 pages

Category
Computer Science:
Computer Vision and Pattern Recognition