DUAL: Diversity and Uncertainty Active Learning for Text Summarization

Published: March 2, 2025 | arXiv ID: 2503.00867v1

By: Petros Stylianos Giouroukis, Alexios Gidiotis, Grigorios Tsoumakas

Potential Business Impact:

Teaches computers to summarize text better with less data.

Business Areas:

A/B Testing, Data and Analytics

With the rise of large language models, neural text summarization has advanced significantly in recent years. However, even state-of-the-art models continue to rely heavily on high-quality human-annotated data for training and evaluation. Active learning is frequently used as an effective way to collect such datasets, especially when annotation resources are scarce. Active learning methods typically prioritize either uncertainty or diversity but have shown limited effectiveness in summarization, often being outperformed by random sampling. We present Diversity and Uncertainty Active Learning (DUAL), a novel algorithm that combines uncertainty and diversity to iteratively select and annotate samples that are both representative of the data distribution and challenging for the current model. DUAL addresses the selection of noisy samples in uncertainty-based methods and the limited exploration scope of diversity-based methods. Through extensive experiments with different summarization models and benchmark datasets, we demonstrate that DUAL consistently matches or outperforms the best-performing strategies. Using visualizations and quantitative metrics, we provide valuable insights into the effectiveness and robustness of different active learning strategies, in an attempt to understand why these strategies have not performed consistently in text summarization. Finally, we show that DUAL strikes a good balance between diversity and robustness.
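
The general idea of combining uncertainty and diversity when choosing a batch to annotate can be illustrated with a minimal sketch. This is not the DUAL algorithm itself, whose details are not given in the abstract; it is a generic two-stage heuristic under stated assumptions: each unlabeled document already has an embedding vector and a scalar uncertainty score, and the names `select_batch`, `shortlist_size`, and `euclidean` are hypothetical. The sketch first shortlists the most uncertain documents, then spreads the batch across the shortlist with greedy farthest-point selection.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_batch(embeddings, uncertainties, batch_size, shortlist_size):
    """Pick indices of documents to annotate (illustrative heuristic only).

    embeddings:     list of vectors, one per unlabeled document (assumed given)
    uncertainties:  list of floats, higher means the model is less sure
    batch_size:     number of documents to return
    shortlist_size: how many top-uncertainty documents to consider at all
    """
    if len(embeddings) != len(uncertainties):
        raise ValueError("embeddings and uncertainties must have equal length")

    # Stage 1 (uncertainty): keep only the most uncertain candidates.
    ranked = sorted(
        range(len(embeddings)), key=lambda i: uncertainties[i], reverse=True
    )
    shortlist = ranked[:shortlist_size]
    if not shortlist or batch_size <= 0:
        return []

    # Stage 2 (diversity): start from the most uncertain candidate, then
    # repeatedly add the candidate farthest from everything selected so far.
    selected = [shortlist[0]]
    remaining = shortlist[1:]
    while remaining and len(selected) < batch_size:
        best = max(
            remaining,
            key=lambda i: min(
                euclidean(embeddings[i], embeddings[j]) for j in selected
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy pool: two near-duplicate clusters plus one low-uncertainty outlier.
pool_embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1], [10.0, 0.0]]
pool_uncertainty = [0.9, 0.8, 0.7, 0.6, 0.1]
batch = select_batch(pool_embeddings, pool_uncertainty, batch_size=2, shortlist_size=4)
```

On this toy pool, ranking by uncertainty alone would return the two near-duplicates at indices 0 and 1, whereas the two-stage selection pairs index 0 with a document from the other cluster. A real system would also need a strategy for noisy high-uncertainty samples, which this sketch does not attempt.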

DUAL: Dynamic Uncertainty-Aware Learning

Machine Learning (CS)

Helps computers learn better with messy information.

21 May 2025 0

90%

Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection

CV and Pattern Recognition

Teaches robots to see better indoors with less data.

20 Mar 2025 0

87%

Active Learning Methods for Efficient Data Utilization and Model Performance Enhancement

Machine Learning (CS)

Teaches computers to learn with fewer examples.

21 Apr 2025 1

Country of Origin

🇬🇷 Greece

Page Count

23 pages
