Multi-Attribute Multi-Grained Adaptation of Pre-Trained Language Models for Text Understanding from Bayesian Perspective
By: You Zhang, Jin Wang, Liang-Chih Yu, and more
Potential Business Impact:
Makes computer language models understand text better.
Current neural networks often employ multi-domain learning or attribute-injecting mechanisms to incorporate non-independent and identically distributed (non-IID) information into text understanding tasks, capturing both individual sample characteristics and the relationships among samples. However, how much this non-IID information actually contributes, and how these methods affect pre-trained language models (PLMs), remains unclear. This study revisits the assumption that non-IID information improves PLM performance from a Bayesian perspective, which surfaces and integrates both non-IID and IID features. Building on this view, we propose a multi-attribute multi-grained framework for PLM adaptation (M2A), which combines multi-attribute and multi-grained views to mitigate uncertainty in a lightweight manner. We evaluate M2A on widely used text-understanding datasets and demonstrate its superior performance, particularly when data are implicitly non-IID and PLMs scale larger.
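The Bayesian reading above can be made concrete. One plausible factorization (an illustration, not necessarily the paper's exact derivation) applies Bayes' rule to condition a prediction on sample attributes $a$ (e.g., user or product IDs):

$$p(y \mid x, a) = \frac{p(a \mid x, y)\, p(y \mid x)}{p(a \mid x)} \propto p(a \mid x, y)\, p(y \mid x),$$

where $p(y \mid x)$ is the attribute-agnostic (IID) term and $p(a \mid x, y)$ injects attribute-specific (non-IID) information.

The sketch below shows one common way such lightweight multi-attribute adaptation can be realized: attribute-specific bottleneck adapters over a frozen PLM representation, fused with the shared (IID) view by a learned gate. All names (`AttributeAdapter`, `M2AHead`) and design choices here are hypothetical, assumed for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class AttributeAdapter(nn.Module):
    """Small residual bottleneck adapter; one per attribute (e.g., user, product).
    Hypothetical module, not the paper's exact architecture."""
    def __init__(self, hidden: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual bottleneck: h + up(relu(down(h)))
        return h + self.up(torch.relu(self.down(h)))

class M2AHead(nn.Module):
    """Fuse an attribute-agnostic (IID) view with attribute-specific
    (non-IID) views via a learned gate, then classify."""
    def __init__(self, hidden: int, num_attributes: int, num_classes: int):
        super().__init__()
        self.adapters = nn.ModuleList(
            AttributeAdapter(hidden) for _ in range(num_attributes)
        )
        # Gate weighs the IID view against each non-IID view.
        self.gate = nn.Linear(hidden * (num_attributes + 1), num_attributes + 1)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, h_iid: torch.Tensor) -> torch.Tensor:
        views = [h_iid] + [adapter(h_iid) for adapter in self.adapters]
        stacked = torch.stack(views, dim=1)               # (B, V, H)
        weights = torch.softmax(
            self.gate(torch.cat(views, dim=-1)), dim=-1  # (B, V)
        )
        fused = (weights.unsqueeze(-1) * stacked).sum(1)  # (B, H)
        return self.classifier(fused)

# Usage: h_iid would be a PLM's pooled [CLS] representation.
head = M2AHead(hidden=768, num_attributes=2, num_classes=5)
logits = head(torch.randn(4, 768))  # batch of 4 pooled embeddings
print(logits.shape)                 # torch.Size([4, 5])
```

In a setup like this, only the adapters, gate, and classifier are trained while the PLM encoder stays frozen or lightly tuned, which is what keeps the adaptation lightweight.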
Similar Papers
Adaptive Federated Distillation for Multi-Domain Non-IID Textual Data
Computation and Language
Helps AI learn from many different kinds of text.
Towards Multi-modal Graph Large Language Model
Machine Learning (CS)
Teaches computers to understand many kinds of connected information.
MLaGA: Multimodal Large Language and Graph Assistant
Artificial Intelligence
Helps computers understand pictures and words together.