A Survey on Transformer Context Extension: Approaches and Evaluation
By: Yijun Liu, Jinzheng Yu, Yang Xu, and more
Potential Business Impact:
Helps computers understand long stories better.
Large language models (LLMs) based on the Transformer have been widely applied in the field of natural language processing (NLP), demonstrating strong performance, particularly on short-text tasks. However, in long-context scenarios the performance of LLMs degrades due to several challenges. To alleviate this, a number of approaches have been proposed recently. In this survey, we first list the challenges of applying pre-trained LLMs to long contexts. We then systematically review the approaches related to long context and propose a taxonomy that categorizes them into four main types: positional encoding, context compression, retrieval augmentation, and attention pattern. In addition to the approaches, we focus on the evaluation of long context, organizing relevant data, tasks, and metrics based on existing long-context benchmarks. Finally, we summarize unresolved issues in the long-context domain and put forward our views on future developments.
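As a concrete illustration of the positional-encoding family named in the taxonomy above, the sketch below shows a minimal positional-interpolation scheme: positions beyond the pre-training window are rescaled so that rotary-embedding (RoPE) angles stay within the range seen during training. This is an illustrative sketch, not code from the survey; the function names (rope_angles, interpolate_positions) and the specific lengths (train_len=2048, target_len=8192) are hypothetical choices for the example.

import numpy as np

def rope_angles(positions, dim=64, base=10000.0):
    # Rotary-embedding angles for each position and frequency pair.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))  # shape: (dim/2,)
    return np.outer(positions, inv_freq)                     # shape: (len, dim/2)

def interpolate_positions(positions, train_len=2048, target_len=8192):
    # Rescale positions so a target_len context maps back into the trained range.
    return positions * (train_len / target_len)

seq = np.arange(8192)
angles_extrapolated = rope_angles(seq)                         # angles exceed the trained range
angles_interpolated = rope_angles(interpolate_positions(seq))  # angles stay within the trained range
print(angles_extrapolated.max(), angles_interpolated.max())

The design choice illustrated here is that interpolation trades positional resolution for staying in-distribution, which is one reason positional-encoding methods are often combined with light fine-tuning at the extended length.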
Similar Papers
Shifting Long-Context LLMs Research from Input to Output
Computation and Language
Helps computers write long, smart stories.
A Comprehensive Survey on Long Context Language Modeling
Computation and Language
Helps computers understand very long stories.
Long-Short Alignment for Effective Long-Context Modeling in LLMs
Computation and Language
Makes AI remember more of what you say.