MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
By: Yiwei Ma, Guohai Xu, Xiaoshuai Sun, and more
Potential Business Impact:
Finds the best examples to teach AI to follow instructions.
Visual instruction tuning (VIT) has emerged as a crucial technique for enabling multi-modal large language models (MLLMs) to follow user instructions adeptly. Yet a significant gap persists in understanding what makes instruction tuning data high-quality and how to select such data automatically. To address this, we introduce MLLM-Selector, an automated approach that identifies valuable VIT data by weighing necessity and diversity. Our process starts by randomly sampling a subset from the VIT data pool to fine-tune a pretrained model, creating a seed model with an initial ability to follow instructions. Leveraging the seed model, we then calculate a necessity score for each sample in the VIT data pool to identify the samples most pivotal for improving model performance. Our findings underscore the importance of combining necessity and diversity in data selection, so MLLM-Selector fuses necessity scoring with strategic diversity-aware sampling for superior data refinement. Empirical results indicate that, under identical experimental conditions, MLLM-Selector surpasses LLaVA-1.5 on some benchmarks with less than 1% of the data and consistently exceeds its performance across all validated benchmarks when using less than 50%.
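The abstract does not give concrete formulas, but the described pipeline (seed model, necessity scoring, diversity-aware sampling) can be sketched minimally. In the sketch below, using the seed model's per-sample loss as the necessity signal and greedy farthest-point sampling over embeddings for diversity are illustrative assumptions, not the authors' exact method; `necessity_scores` and `select` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the VIT data pool: each sample is represented by an
# embedding (used for diversity). Real usage would embed image-text pairs.
POOL_SIZE, DIM = 1000, 32
embeddings = rng.normal(size=(POOL_SIZE, DIM))

def necessity_scores(embeddings: np.ndarray) -> np.ndarray:
    """Assumed necessity signal: the seed model's per-sample loss.
    Faked with random values here; in practice, run the seed model
    (fine-tuned on a random subset) over the full pool."""
    return rng.uniform(size=len(embeddings))

def select(embeddings, scores, budget, shortlist_factor=5):
    """Shortlist the highest-necessity samples, then pick a diverse subset
    via greedy farthest-point sampling over embeddings (an assumption;
    the paper's exact diversity sampler may differ)."""
    shortlist = np.argsort(scores)[::-1][: budget * shortlist_factor]
    emb = embeddings[shortlist]
    chosen = [0]  # start from the top-necessity sample
    min_dist = np.linalg.norm(emb - emb[0], axis=1)
    while len(chosen) < budget:
        nxt = int(np.argmax(min_dist))  # farthest from the current set
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(emb - emb[nxt], axis=1))
    return shortlist[chosen]

scores = necessity_scores(embeddings)
selected = select(embeddings, scores, budget=50)
print(f"selected {len(selected)} of {POOL_SIZE} samples")
```

The shortlist step is one simple way to trade off the two criteria: necessity first prunes the pool to pivotal samples, then diversity spreads the final picks across that shortlist.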
Similar Papers
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning
Machine Learning (CS)
Teaches computers to follow instructions better with less data.
Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning
Computer Vision and Pattern Recognition
Helps AI learn better from pictures and words.
Importance-Aware Data Selection for Efficient LLM Instruction Tuning
Computation and Language
Finds the best lessons to teach AI faster.