Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion
By: Weiye Chen, Qingen Zhu, Qian Long
Potential Business Impact:
Makes AI art creation fast on small devices.
Recent advances in visual synthesis have leveraged diffusion models and attention mechanisms to achieve high-fidelity artistic style transfer and photorealistic text-to-image generation. However, real-time deployment on edge devices remains challenging due to computational and memory constraints. We propose Muon-AD, a co-designed framework that integrates the Muon optimizer with attention distillation for real-time edge synthesis. By eliminating gradient conflicts through orthogonal parameter updates and dynamic pruning, Muon-AD converges 3.2x faster than Stable Diffusion-TensorRT while maintaining synthesis quality (15% lower FID, 4% higher SSIM). Our framework reduces peak memory to 7 GB on the Jetson Orin and enables 24 FPS real-time generation through mixed-precision quantization and curriculum learning. Extensive experiments on COCO-Stuff and ImageNet-Texture demonstrate Muon-AD's Pareto-optimal efficiency-quality trade-offs. Here, we show a 65% reduction in communication overhead during distributed training and real-time 10 s/image generation on edge GPUs. These advances pave the way for democratizing high-quality visual synthesis in resource-constrained environments.
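The core recipe named in the abstract, a Muon-style orthogonalized momentum update driving an attention distillation loss, can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the Newton-Schulz coefficients follow public Muon implementations rather than this paper, and the function names, learning rate, momentum coefficient, and MSE attention loss are assumptions, not Muon-AD's actual configuration (dynamic pruning, quantization, and curriculum learning are omitted).

import torch


def newton_schulz_orthogonalize(m: torch.Tensor, steps: int = 5,
                                eps: float = 1e-7) -> torch.Tensor:
    # Approximately orthogonalize a 2D momentum matrix with the quintic
    # Newton-Schulz iteration used in public Muon implementations; the
    # coefficients below are the commonly published ones, not values
    # taken from the Muon-AD paper.
    a, b, c = 3.4445, -4.7750, 2.0315
    x = m / (m.norm() + eps)
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return x.T if transposed else x


def attention_distillation_loss(student_attn, teacher_attn):
    # One plausible distillation objective: MSE between student and
    # (frozen) teacher attention maps. The paper's exact loss may differ.
    return torch.nn.functional.mse_loss(student_attn, teacher_attn.detach())


def muon_style_step(weight, momentum_buf, grad, lr=0.02, beta=0.95):
    # Momentum accumulation followed by an orthogonalized update direction,
    # the mechanism the abstract credits with removing gradient conflicts.
    momentum_buf.mul_(beta).add_(grad)
    weight.data.add_(newton_schulz_orthogonalize(momentum_buf), alpha=-lr)


if __name__ == "__main__":
    # Toy usage: pull a random student attention map toward a fixed
    # "teacher" map by updating one projection matrix with the step above.
    torch.manual_seed(0)
    d = 64
    w = torch.randn(d, d, requires_grad=True)
    momentum = torch.zeros(d, d)
    x = torch.randn(16, d)
    teacher_attn = torch.softmax(torch.randn(16, 16), dim=-1)

    for step in range(10):
        q = x @ w
        student_attn = torch.softmax(q @ q.T / d ** 0.5, dim=-1)
        loss = attention_distillation_loss(student_attn, teacher_attn)
        loss.backward()
        with torch.no_grad():
            muon_style_step(w, momentum, w.grad)
            w.grad.zero_()
        print(f"step {step}: distillation loss = {loss.item():.4f}")

In published Muon setups the orthogonalized step is applied only to 2D weight matrices, with embeddings and normalization parameters handled by a standard optimizer; that split, like everything above, is an implementation convention rather than something stated in the abstract.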