PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language
By: Naghmeh Jamali, Milad Mohammadi, Danial Baledi and more
Potential Business Impact:
Helps computers answer health questions in Persian.
Medical consumer question answering (CQA) is crucial for empowering patients by providing personalized and reliable health information. Despite recent advances in large language models (LLMs) for medical QA, consumer-oriented and multilingual resources, particularly in low-resource languages like Persian, remain sparse. To bridge this gap, we present PerMedCQA, the first Persian-language benchmark for evaluating LLMs on real-world, consumer-generated medical questions. Curated from a large medical QA forum, PerMedCQA contains 68,138 question-answer pairs, refined through careful data cleaning from an initial set of 87,780 raw entries. We evaluate several state-of-the-art multilingual and instruction-tuned LLMs, utilizing MedJudge, a novel rubric-based evaluation framework driven by an LLM grader, validated against expert human annotators. Our results highlight key challenges in multilingual medical QA and provide valuable insights for developing more accurate and context-aware medical assistance systems. The data is publicly available at https://huggingface.co/datasets/NaghmehAI/PerMedCQA.
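Since the dataset is hosted on the Hugging Face Hub, a minimal sketch of how one might load and inspect it with the `datasets` library is shown below. The split name and column names are assumptions made for illustration; consult the dataset card at the URL above for the actual schema.

```python
# Minimal sketch: loading PerMedCQA from the Hugging Face Hub.
# NOTE: the "train" split and the column names printed here are assumptions;
# check https://huggingface.co/datasets/NaghmehAI/PerMedCQA for the real schema.
from datasets import load_dataset

dataset = load_dataset("NaghmehAI/PerMedCQA", split="train")

print(dataset)            # row count and column names
example = dataset[0]      # first consumer question-answer pair
for key, value in example.items():
    # Truncate long fields so Persian-language answers stay readable in the console.
    print(f"{key}: {str(value)[:200]}")
```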
Similar Papers
PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering Benchmark
Computation and Language
Helps AI answer medical questions in Persian and English.
MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks
Computation and Language
Helps AI understand Arabic medical questions.
PeruMedQA: Benchmarking Large Language Models (LLMs) on Peruvian Medical Exams -- Dataset Construction and Evaluation
Computation and Language
Helps AI answer Peruvian medical exam questions.