PubTables-v2: A new large-scale dataset for full-page and multi-page table extraction
By: Brandon Smock, Valerie Faucon-Morin, Max Sokolov, and more
Potential Business Impact:
Helps computers find and understand tables in documents.
Table extraction (TE) is a key challenge in visual document understanding. Traditional approaches detect tables first, then recognize their structure. Recently, interest has surged in developing methods, such as vision-language models (VLMs), that can extract tables directly in their full page or document context. However, progress has been difficult to demonstrate due to a lack of annotated data. To address this, we create a new large-scale dataset, PubTables-v2. PubTables-v2 supports a number of challenging current table extraction tasks. Notably, it is the first large-scale benchmark for multi-page table structure recognition. We demonstrate its usefulness by evaluating domain-specialized VLMs on these tasks and highlighting current progress. Finally, we use PubTables-v2 to create the Page-Object Table Transformer (POTATR), an image-to-graph model that extends the Table Transformer to comprehensive page-level TE. Data, code, and trained models will be released.
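To make the contrast concrete, below is a minimal sketch of the traditional two-stage pipeline the abstract describes (detect tables first, then recognize their structure), using the publicly released Table Transformer checkpoints on Hugging Face as stand-ins for the two stages. The model names, thresholds, and crop handling here are illustrative assumptions for exposition; this is not the paper's POTATR model or its released code.

# Two-stage table extraction sketch: (1) detect table regions on the page,
# (2) recognize row/column structure within each detected table crop.
# Model names and thresholds are illustrative, not from the paper.
from PIL import Image
import torch
from transformers import AutoImageProcessor, TableTransformerForObjectDetection

def two_stage_table_extraction(page_image_path: str):
    page = Image.open(page_image_path).convert("RGB")

    # Stage 1: table detection on the full page image.
    det_processor = AutoImageProcessor.from_pretrained("microsoft/table-transformer-detection")
    det_model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")
    det_inputs = det_processor(images=page, return_tensors="pt")
    with torch.no_grad():
        det_outputs = det_model(**det_inputs)
    page_size = torch.tensor([page.size[::-1]])  # (height, width)
    detections = det_processor.post_process_object_detection(
        det_outputs, threshold=0.7, target_sizes=page_size)[0]

    # Stage 2: structure recognition on each detected table crop.
    str_processor = AutoImageProcessor.from_pretrained("microsoft/table-transformer-structure-recognition")
    str_model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-structure-recognition")
    tables = []
    for box in detections["boxes"]:
        crop = page.crop([int(v) for v in box.tolist()])
        crop_inputs = str_processor(images=crop, return_tensors="pt")
        with torch.no_grad():
            crop_outputs = str_model(**crop_inputs)
        crop_size = torch.tensor([crop.size[::-1]])
        # Detected objects here correspond to rows, columns, spanning cells, etc.
        structure = str_processor.post_process_object_detection(
            crop_outputs, threshold=0.6, target_sizes=crop_size)[0]
        tables.append({"bbox": box.tolist(), "structure": structure})
    return tables

Because each table is cropped and processed in isolation, this style of pipeline has no natural way to handle tables that continue across page boundaries, which is the gap the multi-page benchmark in PubTables-v2 is meant to expose.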
Similar Papers
Benchmarking Table Extraction from Heterogeneous Scientific Documents
Databases
Helps computers understand tables in messy documents.
TABLET: A Large-Scale Dataset for Robust Visual Table Understanding
CV and Pattern Recognition
Helps computers understand real-world tables better.
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
CV and Pattern Recognition
Teaches computers to read tables without examples.