Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs

Published: April 28, 2025 | arXiv ID: 2504.19675v2

By: Osma Suominen, Juho Inkinen, Mona Lehtinen

Potential Business Impact:

Helps libraries automatically sort books by topic.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

This paper presents the Annif system in SemEval-2025 Task 5 (LLMs4Subjects), which focussed on subject indexing using large language models (LLMs). The task required creating subject predictions for bibliographic records from the bilingual TIBKAT database using the GND subject vocabulary. Our approach combines traditional natural language processing and machine learning techniques implemented in the Annif toolkit with innovative LLM-based methods for translation and synthetic data generation, and it merges predictions from monolingual models. The system ranked first in the all-subjects category and second in the tib-core-subjects category in the quantitative evaluation, and fourth in the qualitative evaluation. These findings demonstrate the potential of combining traditional XMTC algorithms with modern LLM techniques to improve the accuracy and efficiency of subject indexing in multilingual contexts.

Annif at the GermEval-2025 LLMs4Subjects Task: Traditional XMTC Augmented by Efficient LLMs

Computation and Language

Helps libraries find books faster using smart computers.

21 Aug 2025

89%

DNB-AI-Project at SemEval-2025 Task 5: An LLM-Ensemble Approach for Automated Subject Indexing

Computation and Language

Tags library books automatically for better searching.

30 Apr 2025

89%

SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog

Computation and Language

Helps libraries sort science papers automatically.

9 Apr 2025

Country of Origin

🇫🇮 Finland

Page Count

8 pages
