Enhancing LLM Knowledge Learning through Generalization
By: Mingkang Zhu, Xi Chen, Zhongdao Wang and more
Potential Business Impact:
Helps computers remember new facts without forgetting old ones.
As large language models (LLMs) are increasingly deployed in diverse applications, faithfully integrating evolving factual knowledge into these models remains a critical challenge. Continued pre-training on paraphrased data has shown empirical promise for enhancing knowledge acquisition. However, this approach is often costly and unreliable, as it relies on external models or manual effort for rewriting, and may inadvertently alter the factual content. In this work, we hypothesize and empirically show that an LLM's ability to continually predict the same factual knowledge tokens given diverse paraphrased contexts is positively correlated with its capacity to extract that knowledge via question-answering. Based on this view, we introduce two strategies that improve LLMs' ability to predict the same knowledge tokens given varied contexts, thereby enhancing knowledge acquisition. First, we propose formatting-based data augmentation, which diversifies documents conveying the same knowledge by altering document formats rather than their content, thereby preserving factual integrity. Second, we adopt sharpness-aware minimization as the optimizer to further improve generalization. Extensive experiments demonstrate our methods' effectiveness in both continued pre-training and instruction tuning, and further gains can be achieved by combining them with paraphrased data.
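For readers unfamiliar with sharpness-aware minimization (SAM), the following is a minimal sketch of the generic two-step update: first move the weights a small distance rho along the normalized gradient, then apply the gradient measured at that perturbed point to the original weights. It is not the authors' training code. The function names (sam_step, toy_loss, toy_grad, run_demo), the quadratic toy loss, and the values of lr and rho are all assumptions made for illustration.

```python
import math
from typing import Callable, List

Vector = List[float]


def sam_step(
    weights: Vector,
    grad_fn: Callable[[Vector], Vector],
    lr: float = 0.1,
    rho: float = 0.05,
) -> Vector:
    """One generic SAM update (illustrative; lr and rho are assumed values)."""
    # Step 1: gradient at the current weights.
    grad = grad_fn(weights)
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        # Zero gradient: no ascent direction to perturb along.
        return list(weights)

    # Step 2: move to the approximate worst-case point inside an
    # L2 ball of radius rho around the current weights.
    perturbed = [w + rho * g / norm for w, g in zip(weights, grad)]

    # Step 3: take the gradient at the perturbed point and apply it
    # to the ORIGINAL weights, not the perturbed ones.
    sharp_grad = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, sharp_grad)]


def toy_loss(w: Vector) -> float:
    """Hypothetical stand-in for a language-modeling loss: sum of squares."""
    return sum(x * x for x in w)


def toy_grad(w: Vector) -> Vector:
    """Analytic gradient of toy_loss."""
    return [2.0 * x for x in w]


def run_demo(steps: int = 50) -> Vector:
    """Run a few SAM steps on the toy loss from a fixed starting point."""
    w = [1.0, -2.0]
    for _ in range(steps):
        w = sam_step(w, toy_grad)
    return w


if __name__ == "__main__":
    final = run_demo()
    print("final weights:", final, "loss:", toy_loss(final))
```

In a real continued pre-training run the gradient function would be a forward and backward pass of the model on a batch, so each SAM step costs roughly two such passes instead of one.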
Similar Papers
Semantic Mastery: Enhancing LLMs with Advanced Natural Language Understanding
Computation and Language
Makes AI understand and talk like people.
Rethinking Data: Towards Better Performing Domain-Specific Small Language Models
Computation and Language
Makes small AI models answer questions as well as big ones.
Question Answering under Temporal Conflict: Evaluating and Organizing Evolving Knowledge with LLMs
Computation and Language
Helps computers remember and use new facts.