
BengaliFig: A Low-Resource Challenge for Figurative and Culturally Grounded Reasoning in Bengali

Published: November 25, 2025 | arXiv ID: 2511.20399v1

By: Abdullah Al Sefat

Potential Business Impact:

Teaches computers to understand Bengali riddles.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large language models excel on broad multilingual benchmarks but remain to be evaluated extensively in figurative and culturally grounded reasoning, especially in low-resource contexts. We present BengaliFig, a compact yet richly annotated challenge set that targets this gap in Bengali, a widely spoken low-resourced language. The dataset contains 435 unique riddles drawn from Bengali oral and literary traditions. Each item is annotated along five orthogonal dimensions capturing reasoning type, trap type, cultural depth, answer category, and difficulty, and is automatically converted to multiple-choice format through a constraint-aware, AI-assisted pipeline. We evaluate eight frontier LLMs from major providers under zero-shot and few-shot chain-of-thought prompting, revealing consistent weaknesses in metaphorical and culturally specific reasoning. BengaliFig thus contributes both a diagnostic probe for evaluating LLM robustness in low-resource cultural contexts and a step toward inclusive and heritage-aware NLP evaluation.
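To make the annotation scheme concrete, the following is a minimal sketch of how one multiple-choice item and a per-dimension accuracy breakdown might be represented. The field names, label values, and the `accuracy_by` helper are assumptions made for illustration, not the paper's actual schema or evaluation code; only the five dimension names come from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RiddleItem:
    # Hypothetical record layout: the five annotation fields mirror the
    # dimensions named in the abstract; all other names are illustrative.
    riddle: str
    options: Tuple[str, ...]
    answer_index: int
    reasoning_type: str
    trap_type: str
    cultural_depth: str
    answer_category: str
    difficulty: str


def accuracy_by(items: List[RiddleItem], predictions: List[int],
                dimension: str) -> Dict[str, float]:
    """Return {label: accuracy} for one annotation dimension."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        label = getattr(item, dimension)
        total[label] += 1
        correct[label] += int(pred == item.answer_index)
    return {label: correct[label] / total[label] for label in total}


# Placeholder items with made-up labels; none are taken from the dataset.
items = [
    RiddleItem("riddle A", ("w", "x", "y", "z"), 0,
               "metaphorical", "literal", "high", "object", "hard"),
    RiddleItem("riddle B", ("w", "x", "y", "z"), 2,
               "metaphorical", "none", "low", "animal", "easy"),
    RiddleItem("riddle C", ("w", "x", "y", "z"), 1,
               "descriptive", "none", "low", "object", "easy"),
]
predictions = [0, 3, 1]  # hypothetical model choices, one per item
scores = accuracy_by(items, predictions, "reasoning_type")
```

Grouping by a single dimension at a time is one simple way to surface the kind of weakness the abstract reports, for example lower accuracy on items labeled as metaphorical than on other reasoning types.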

Page Count
20 pages

Category
Computer Science:
Computation and Language