Efficient and Effective In-context Demonstration Selection with Coreset
By: Zihua Wang, Jiarui Wang, Haiyang Xu, and more
Potential Business Impact:
Helps AI learn better from fewer examples.
In-context learning (ICL) has emerged as a powerful paradigm for Large Visual Language Models (LVLMs), enabling them to leverage a few examples provided directly in the input context. However, the effectiveness of this approach depends heavily on the selection of demonstrations, a process that is NP-hard. Traditional strategies, including random sampling, similarity-based sampling, and InfoScore-based sampling, often lead to inefficiency or suboptimal performance, struggling to balance efficiency and effectiveness in demonstration selection. In this paper, we propose a novel demonstration selection framework named Coreset-based Dual Retrieval (CoDR). We show that samples within a diverse subset achieve higher expected mutual information. To implement this, we introduce a cluster-pruning method to construct a diverse coreset that aligns more effectively with the query while maintaining diversity. Additionally, we develop a dual retrieval mechanism that enhances the selection process by achieving global demonstration selection while preserving efficiency. Experimental results demonstrate that our method significantly improves ICL performance over existing strategies, providing a robust solution for effective and efficient demonstration selection.
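The abstract's two-stage idea (prune the demonstration pool into a diverse coreset, then retrieve demonstrations from that coreset for each query) can be illustrated with a minimal numpy sketch. This is not the authors' CoDR implementation; it assumes demonstrations are already embedded as vectors and substitutes plain k-means clustering for the paper's cluster-pruning step and cosine-similarity ranking for the dual retrieval mechanism.

```python
import numpy as np

def build_coreset(embeddings, n_clusters=4, per_cluster=2, iters=10, seed=0):
    """Cluster the demonstration pool and keep a few representatives per
    cluster, yielding a small but diverse coreset (a simplified stand-in
    for the paper's cluster-pruning step)."""
    rng = np.random.default_rng(seed)
    centers = embeddings[rng.choice(len(embeddings), n_clusters, replace=False)]
    for _ in range(iters):  # plain k-means
        dists = np.linalg.norm(embeddings[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = embeddings[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    coreset = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        if len(idx) == 0:
            continue
        # keep the points closest to each cluster center
        dist = np.linalg.norm(embeddings[idx] - centers[c], axis=1)
        coreset.extend(idx[np.argsort(dist)[:per_cluster]].tolist())
    return sorted(coreset)

def retrieve_demonstrations(query_emb, embeddings, coreset, k=3):
    """Rank coreset members by cosine similarity to the query embedding
    and return the indices of the top-k demonstrations."""
    cand = embeddings[coreset]
    sims = cand @ query_emb / (
        np.linalg.norm(cand, axis=1) * np.linalg.norm(query_emb) + 1e-12)
    order = np.argsort(-sims)[:k]
    return [coreset[i] for i in order]

# toy demonstration pool: 40 random 8-dimensional embeddings
rng = np.random.default_rng(1)
pool = rng.normal(size=(40, 8))
coreset = build_coreset(pool, n_clusters=4, per_cluster=2)
demos = retrieve_demonstrations(rng.normal(size=8), pool, coreset, k=3)
print(coreset, demos)
```

Because retrieval only scans the small coreset rather than the full pool, per-query cost stays low while the clustering step preserves diversity, which is the efficiency/effectiveness trade-off the paper targets.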
Similar Papers
Enhancing Multimodal In-Context Learning for Image Classification through Coreset Optimization
CV and Pattern Recognition
Makes AI learn faster with fewer examples.
Data-Efficient Biomedical In-Context Learning: A Diversity-Enhanced Submodular Perspective
Computation and Language
Helps AI learn new medical jobs faster.
Learn to Select: Exploring Label Distribution Divergence for In-Context Demonstration Selection in Text Classification
Computation and Language
Picks best examples to teach computers faster.