ReasonTabQA: A Comprehensive Benchmark for Table Question Answering from Real World Industrial Scenarios
By: Changzai Pan, Jie Zhang, Kaiwen Wei, et al.
Recent advancements in Large Language Models (LLMs) have significantly catalyzed table-based question answering (TableQA). However, existing TableQA benchmarks often overlook the intricacies of industrial scenarios, which are characterized by multi-table structures, nested headers, and massive scales. These environments demand robust table reasoning through deep structured inference, presenting a significant challenge that remains inadequately addressed by current methodologies. To bridge this gap, we present ReasonTabQA, a large-scale bilingual benchmark encompassing 1,932 tables across 30 industry domains such as energy and automotive. ReasonTabQA provides high-quality annotations for both final answers and explicit reasoning chains, supporting both thinking and no-thinking paradigms. Furthermore, we introduce TabCodeRL, a reinforcement learning method that leverages table-aware verifiable rewards to guide the generation of logical reasoning paths. Extensive experiments on ReasonTabQA and 4 TableQA datasets demonstrate that while TabCodeRL yields substantial performance gains on open-source LLMs, the persistent performance gap on ReasonTabQA underscores the inherent complexity of real-world industrial TableQA.