AFRICAPTION: Establishing a New Paradigm for Image Captioning in African Languages

Published: October 20, 2025 | arXiv ID: 2510.17405v1

By: Mardiyyah Oduwole, Prince Mireku, Fatimo Adebanjo and more

Potential Business Impact:

Lets computers describe pictures in African languages.

Business Areas:

Image Recognition, Data and Analytics, Software

Multimodal AI research has overwhelmingly focused on high-resource languages, hindering the democratization of advancements in the field. To address this, we present AfriCaption, a comprehensive framework for multilingual image captioning in 20 African languages. Our contributions are threefold: (i) a curated dataset built on Flickr8k, featuring semantically aligned captions generated via a context-aware selection and translation process; (ii) a dynamic, context-preserving pipeline that ensures ongoing quality through model ensembling and adaptive substitution; and (iii) the AfriCaption model, a 0.5B-parameter vision-to-text architecture that integrates SigLIP and NLLB200 for caption generation across under-represented languages. This unified framework ensures ongoing data quality and establishes the first scalable image-captioning resource for under-represented African languages, laying the groundwork for truly inclusive multimodal AI.

Multilingual Training-Free Remote Sensing Image Captioning

CV and Pattern Recognition

Lets computers describe satellite pictures in any language.

30 Nov 2025

The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP

Computation and Language

Helps computers understand many African languages.

7 Oct 2025

AfriMTEB and AfriE5: Benchmarking and Adapting Text Embedding Models for African Languages

Computation and Language

Helps computers understand African languages better.

27 Oct 2025

Page Count

13 pages
