DomainCQA: Crafting Knowledge-Intensive QA from Domain-Specific Charts

Published: March 25, 2025 | arXiv ID: 2503.19498v4

By: Yujing Lu, Ling Zhong, Jing Yang and more

Potential Business Impact:

Teaches computers to understand complex charts better.

Business Areas:
Text Analytics, Data and Analytics, Software

Chart Question Answering (CQA) evaluates Multimodal Large Language Models (MLLMs) on visual understanding and reasoning over chart data. However, existing benchmarks mostly test surface-level parsing, such as reading labels and legends, while overlooking deeper scientific reasoning. We propose DomainCQA, a framework for constructing domain-specific CQA benchmarks that emphasize both visual comprehension and knowledge-intensive reasoning. It integrates complexity-aware chart selection, multitier QA generation, and expert validation. Applied to astronomy, DomainCQA yields AstroChart, a benchmark of 1,690 QA pairs over 482 charts, exposing persistent weaknesses in fine-grained perception, numerical reasoning, and domain knowledge integration across 21 MLLMs. Fine-tuning on AstroChart improves performance across fundamental and advanced tasks. Pilot QA sets in biochemistry, economics, medicine, and social science further demonstrate DomainCQA's generality. Together, our results establish DomainCQA as a unified pipeline for constructing and augmenting domain-specific chart reasoning benchmarks.

Page Count
85 pages

Category
Computer Science:
Computation and Language