AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
By: Yangning Li, Shaoshen Chen, Yinghui Li, and more
Potential Business Impact:
Makes computers understand long stories better.
The quadratic complexity of self-attention constrains Large Language Models (LLMs) in processing long contexts, a capability essential for many advanced applications. Context compression aims to alleviate this computational bottleneck while retaining critical semantic information. However, existing approaches often fall short: explicit methods may compromise local detail, whereas implicit methods can suffer from positional biases, information degradation, or an inability to capture long-range semantic dependencies. We propose AdmTree, a novel framework for adaptive, hierarchical context compression with a central focus on preserving high semantic fidelity while maintaining efficiency. AdmTree dynamically segments the input based on information density, utilizing gist tokens to summarize variable-length segments as the leaves of a semantic binary tree. This structure, together with a lightweight aggregation mechanism and a frozen backbone LLM (thereby minimizing new trainable parameters), enables efficient hierarchical abstraction of the context. By preserving fine-grained details alongside global semantic coherence, mitigating positional bias, and dynamically adapting to content, AdmTree robustly retains the semantic information of long contexts.
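To make the hierarchical idea in the abstract concrete, here is a minimal, illustrative Python sketch of building a binary tree whose leaves hold per-segment summaries and whose internal nodes aggregate their children. It is written under assumptions and is not the authors' implementation: the functions segment_by_density, encode_gist, and aggregate are hypothetical placeholders standing in for the paper's density-based segmentation, gist-token summarization with a frozen backbone LLM, and lightweight aggregation mechanism.

```python
# Illustrative sketch of a "semantic tree" over gist summaries.
# All names and the toy aggregation below are hypothetical placeholders.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    gist: List[float]                 # compressed representation of a span
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def segment_by_density(tokens: List[str]) -> List[List[str]]:
    """Placeholder: split the context into variable-length segments.
    The paper segments adaptively by information density; fixed-size
    chunks are used here purely for illustration."""
    size = 8
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def encode_gist(segment: List[str]) -> List[float]:
    """Placeholder for summarizing a segment into gist-token embeddings.
    A real system would obtain these from the frozen backbone LLM."""
    return [float(len(segment))]      # toy one-dimensional "embedding"


def aggregate(a: List[float], b: List[float]) -> List[float]:
    """Placeholder lightweight aggregation of two child gists."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def build_tree(tokens: List[str]) -> Node:
    """Build a binary tree: leaves are segment gists, internal nodes
    hold aggregated summaries of their children."""
    nodes = [Node(gist=encode_gist(seg)) for seg in segment_by_density(tokens)]
    while len(nodes) > 1:
        merged = []
        for i in range(0, len(nodes) - 1, 2):
            l, r = nodes[i], nodes[i + 1]
            merged.append(Node(gist=aggregate(l.gist, r.gist), left=l, right=r))
        if len(nodes) % 2 == 1:       # carry an unpaired node upward
            merged.append(nodes[-1])
        nodes = merged
    return nodes[0]


if __name__ == "__main__":
    context = ("long context " * 20).split()
    root = build_tree(context)
    print("root gist:", root.gist)
```

The sketch only shows the bottom-up tree construction; how segments are sized, how gist tokens are produced, and how nodes are aggregated are exactly the components the paper designs.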
Similar Papers
Sentence-Anchored Gist Compression for Long-Context LLMs
Computation and Language
Makes computers understand longer stories with less effort.
CompLLM: Compression for Long Context Q&A
Computation and Language
Makes AI understand long texts much faster.
Concept than Document: Context Compression via AMR-based Conceptual Entropy
Computation and Language
Makes AI understand long texts by removing extra words.