Learning Linear Regression with Low-Rank Tasks in-Context
By: Kaito Takanami, Takashi Takahashi, Yoshiyuki Kabashima
Potential Business Impact:
Helps computers learn how to learn tasks.
In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. In particular, it is unclear how ICL operates in real-world settings where tasks share a common structure. In this work, we address this problem by analyzing a linear attention model trained on low-rank regression tasks. Within this setting, we precisely characterize the distribution of predictions and the generalization error in the high-dimensional limit. Moreover, we find that statistical fluctuations in finite pre-training data induce an implicit regularization effect. Finally, we identify a sharp phase transition in the generalization error governed by the task structure. These results provide a framework for understanding how transformers learn to learn the task structure.
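To make the setup concrete, here is a minimal sketch of the kind of problem the abstract describes: regression tasks whose weight vectors live in a shared low-rank subspace, presented as in-context examples to a one-layer linear attention predictor. All names and choices below (the subspace matrix `B`, the learned matrix `Gamma`, the noise level, the dimensions) are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, N = 64, 4, 128      # input dim, task rank, context length (illustrative)
sigma = 0.1               # label noise level (assumed)

# Low-rank task structure: every task vector w lies in a shared r-dimensional
# subspace spanned by the columns of B (a stand-in for the common task prior).
B = rng.standard_normal((d, r)) / np.sqrt(r)

def sample_task():
    """Draw one regression task w = B z with z ~ N(0, I_r)."""
    return B @ rng.standard_normal(r)

def sample_context(w, n):
    """Draw n in-context examples (x_i, y_i) with y_i = w . x_i + noise."""
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    y = X @ w + sigma * rng.standard_normal(n)
    return X, y

def linear_attention_predict(Gamma, X, y, x_query):
    """One-layer linear attention acting on the context: the prediction is a
    bilinear form of the query and the statistic (1/N) * sum_i y_i x_i."""
    h = X.T @ y / len(y)          # aggregated context statistic
    return x_query @ Gamma @ h    # Gamma is the learned d x d matrix

# Usage example with an untrained (identity) Gamma, compared to the true label.
w = sample_task()
X, y = sample_context(w, N)
x_q = rng.standard_normal(d) / np.sqrt(d)
print(linear_attention_predict(np.eye(d), X, y, x_q), x_q @ w)
```

In this picture, pre-training amounts to fitting `Gamma` across many sampled tasks; the paper's analysis concerns how the resulting predictor and its generalization error behave in the high-dimensional limit, which the sketch does not attempt to reproduce.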
Similar Papers
Pretrain-Test Task Alignment Governs Generalization in In-Context Learning
Machine Learning (Stat)
Helps computers learn from examples better.
Transformer learns the cross-task prior and regularization for in-context learning
Machine Learning (CS)
Helps computers learn hidden rules from examples.
How Private is Your Attention? Bridging Privacy with In-Context Learning
Machine Learning (Stat)
Lets AI learn new things privately.