ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement
By: Zhihang Liu, Xiaoyi Bao, Pandeng Li and more
Potential Business Impact:
Makes charts from data tables automatically.
While existing generation and unified models excel at general image generation, they struggle with tasks that require deep reasoning, planning, and precise data-to-visual mapping beyond general scenarios. To push past these limitations, we introduce a new and challenging task: creative table visualization, which requires the model to generate an infographic that faithfully and aesthetically visualizes the data from a given table. To address this challenge, we propose ShowTable, a pipeline that synergizes MLLMs with diffusion models via a progressive self-correcting process. The MLLM acts as the central orchestrator, reasoning about the visual plan and judging visual errors to provide refined instructions, while the diffusion model executes the MLLM's commands to achieve high-fidelity results. To support this task and our pipeline, we introduce three automated data construction pipelines for training different modules. Furthermore, we introduce TableVisBench, a new benchmark with 800 challenging instances across 5 evaluation dimensions, to assess performance on this task. Experiments demonstrate that our pipeline, instantiated with different models, significantly outperforms baselines, highlighting its effective multi-modal reasoning, generation, and error correction capabilities.
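The plan, generate, judge, and refine cycle described in the abstract can be pictured roughly as the loop below. This is only an illustrative sketch, not the authors' implementation: the names `refine_loop`, `Review`, `plan`, `generate`, `judge`, `edit`, and `max_rounds` are all hypothetical, and the toy stand-ins use plain dictionaries in place of a real MLLM and diffusion model.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Hypothetical judge output: a verdict plus a refinement instruction."""
    ok: bool
    instruction: str = ""


def refine_loop(table, plan, generate, judge, edit, max_rounds=3):
    """Plan once, generate once, then judge and edit until accepted.

    plan/judge stand in for the MLLM; generate/edit stand in for the
    diffusion model. All four callables are assumptions for illustration.
    """
    prompt = plan(table)              # MLLM reasons about the visual plan
    image = generate(prompt)          # diffusion model renders a first draft
    history = [image]
    for _ in range(max_rounds):
        review = judge(table, image)  # MLLM checks the draft against the table
        if review.ok:
            break
        image = edit(image, review.instruction)  # diffusion model applies the fix
        history.append(image)
    return image, history


# Toy stand-ins: an "image" is a dict that records which labels were drawn.
def toy_plan(table):
    return "bar chart of " + ", ".join(table)


def toy_generate(prompt):
    return {"prompt": prompt, "labels": []}


def toy_judge(table, image):
    missing = [key for key in table if key not in image["labels"]]
    if not missing:
        return Review(ok=True)
    return Review(ok=False, instruction="add label " + missing[0])


def toy_edit(image, instruction):
    label = instruction.split("add label ", 1)[1]
    return {"prompt": image["prompt"], "labels": image["labels"] + [label]}


final, drafts = refine_loop(
    {"Q1": 10, "Q2": 20}, toy_plan, toy_generate, toy_judge, toy_edit
)
```

In this toy run the judge flags one missing label per round, so the loop stops once every table key appears in the draft or `max_rounds` is exhausted, whichever comes first.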
Similar Papers
Visual-TableQA: Open-Domain Benchmark for Reasoning over Table Images
CV and Pattern Recognition
Helps computers understand information in tables.
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
CV and Pattern Recognition
Teaches computers to read tables without examples.
TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing
CV and Pattern Recognition
Helps computers understand charts and tables faster.