Massive Activations are the Key to Local Detail Synthesis in Diffusion Transformers

Published: October 13, 2025 | arXiv ID: 2510.11538v2

By: Chaofan Gan, Zicheng Zhao, Yuanpeng Tu and more

Potential Business Impact:

Makes AI pictures have sharper, clearer details.

Business Areas:

Data Mining Data and Analytics, Information Technology

Diffusion Transformers (DiTs) have recently emerged as a powerful backbone for visual generation. Recent observations reveal Massive Activations (MAs) in their internal feature maps, yet their function remains poorly understood. In this work, we systematically investigate these activations to elucidate their role in visual generation. We find that these massive activations occur across all spatial tokens, and that their distribution is modulated by the input timestep embeddings. Importantly, our investigations further demonstrate that these massive activations play a key role in local detail synthesis while having minimal impact on the overall semantic content of the output. Building on these insights, we propose Detail Guidance (DG), an MA-driven, training-free self-guidance strategy that explicitly enhances local detail fidelity for DiTs. Specifically, DG constructs a degraded "detail-deficient" model by disrupting MAs and leverages it to guide the original network toward higher-quality detail synthesis. DG integrates seamlessly with Classifier-Free Guidance (CFG), enabling further refinement of fine-grained details. Extensive experiments demonstrate that DG consistently improves fine-grained detail quality across various pre-trained DiTs (e.g., SD3, SD3.5, and Flux).
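The abstract does not give the exact disruption rule or guidance formula, so the following is only a minimal sketch of the idea under stated assumptions. It assumes that MAs sit in a few feature channels shared by all tokens, that "disrupting" them can be approximated by zeroing those channels, and that DG uses the same extrapolation form as other self-guidance methods, added on top of a standard CFG term. All function names, the zeroing rule, and the scale parameters are hypothetical and are not taken from the paper.

```python
def find_massive_channels(features, num_channels=2):
    """Return indices of the channels with the largest mean magnitude.

    features: list of token rows, each a list of channel values.
    Assumption: massive activations show up as a few dominant channels.
    """
    n_tokens = len(features)
    n_channels = len(features[0])
    strength = [
        sum(abs(row[c]) for row in features) / n_tokens
        for c in range(n_channels)
    ]
    ranked = sorted(range(n_channels), key=lambda c: strength[c], reverse=True)
    return sorted(ranked[:num_channels])


def disrupt_massive_activations(features, channels):
    """Build 'detail-deficient' features by zeroing the chosen channels.

    Assumption: zeroing is one possible disruption, not the paper's rule.
    """
    chosen = set(channels)
    return [
        [0.0 if c in chosen else value for c, value in enumerate(row)]
        for row in features
    ]


def detail_guided_prediction(pred_cond, pred_uncond, pred_degraded,
                             cfg_scale=5.0, dg_scale=1.0):
    """Combine a CFG term with a hypothetical detail-guidance term.

    CFG part: uncond + cfg_scale * (cond - uncond)
    DG part:  dg_scale * (cond - degraded), pushing away from the
              prediction of the degraded, detail-deficient model.
    """
    return [
        u + cfg_scale * (c - u) + dg_scale * (c - d)
        for c, u, d in zip(pred_cond, pred_uncond, pred_degraded)
    ]
```

In this sketch, setting `dg_scale=0.0` reduces the result to plain CFG, which mirrors the abstract's claim that DG can be layered on top of CFG rather than replacing it.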

Page Count

23 pages
