Attention Consistency for LLMs Explanation
By: Tian Lan, Jinyuan Xu, Xue He, and more
Potential Business Impact:
Shows which parts of the input a language model relies on when it makes choices.
Understanding the decision-making processes of large language models (LLMs) is essential for their trustworthy development and deployment. However, current interpretability methods often face challenges such as low resolution and high computational cost. To address these limitations, we propose the Multi-Layer Attention Consistency Score (MACS), a novel, lightweight, and easily deployable heuristic for estimating the importance of input tokens in decoder-based models. MACS measures contributions of input tokens based on the consistency of maximal attention. Empirical evaluations demonstrate that MACS achieves a favorable trade-off between interpretability quality and computational efficiency, showing faithfulness comparable to complex techniques with a 22% decrease in VRAM usage and 30% reduction in latency.
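The abstract does not give the exact MACS formula, but the core idea of "consistency of maximal attention" can be sketched as follows: for a chosen query position, count how often each input token is the arg-max attention target across decoder layers. The function and variable names below (`macs_scores`, `attentions`, `query_position`) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of an attention-consistency heuristic in the spirit of MACS.
# Assumption: per-layer attention maps are available, e.g. from a decoder-only
# model run with output_attentions=True (batch dimension already removed).

import torch


def macs_scores(attentions: list[torch.Tensor], query_position: int = -1) -> torch.Tensor:
    """
    attentions: list of per-layer attention tensors, each of shape
                (num_heads, seq_len, seq_len).
    Returns a (seq_len,) tensor giving, for each input token, the fraction of
    layers in which it received the maximal attention from the query position.
    """
    num_layers = len(attentions)
    seq_len = attentions[0].shape[-1]
    counts = torch.zeros(seq_len)

    for layer_attn in attentions:
        # Average over heads, then take the attention row of the chosen query token.
        row = layer_attn.mean(dim=0)[query_position]  # shape: (seq_len,)
        counts[row.argmax()] += 1.0                   # token with maximal attention this layer

    return counts / num_layers                        # consistency score in [0, 1]


if __name__ == "__main__":
    # Toy example: 4 layers, 2 heads, 5 tokens of random (softmax-normalized) attention.
    torch.manual_seed(0)
    fake_attn = [torch.softmax(torch.randn(2, 5, 5), dim=-1) for _ in range(4)]
    print(macs_scores(fake_attn))
```

Because this only reads attention maps already produced during a forward pass, a score of this kind adds little memory or latency overhead, which is consistent with the efficiency claims in the abstract.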
Similar Papers
Internalizing Self-Consistency in Language Models: Multi-Agent Consensus Alignment
Artificial Intelligence
Makes AI think more clearly and agree with itself.
Improving Temporal Understanding Logic Consistency in Video-Language Models via Attention Enhancement
CV and Pattern Recognition
Helps AI models understand what happens over time in videos.
Latent Self-Consistency for Reliable Majority-Set Selection in Short- and Long-Answer Reasoning
Computation and Language
Makes AI answers more reliable and trustworthy.