ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing
By: Xuanle Zhao, Xuexin Liu, Haoyue Yang, and more
Potential Business Impact:
Computers can now edit charts, but not perfectly.
Although multimodal large language models (MLLMs) show promise in generating chart rendering code, editing charts via code is a harder problem. The task requires MLLMs to integrate chart understanding with reasoning, both of which are demanding. While many MLLMs claim such editing capabilities, current evaluations rely on a limited number of case studies, which highlights the need for a comprehensive evaluation framework. In this work, we propose ChartEdit, a novel benchmark designed for chart editing tasks, featuring 1405 diverse editing instructions applied to 233 real-world charts, each manually annotated and validated for accuracy. Using ChartEdit, we evaluate the performance of 10 mainstream MLLMs across two types of experiments at both the code and chart levels. The results suggest that large-scale models can generate code that produces images partially matching the reference images. However, their ability to generate accurate edits according to the instructions remains limited. The state-of-the-art (SOTA) model achieves a score of only 59.96, highlighting significant challenges in precise modification. In contrast, small-scale models, including chart-domain models, struggle both with following editing instructions and with generating overall chart images, underscoring the need for further development in this area. Code is available at https://github.com/xxlllz/ChartEdit.
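To make "editing charts via code" concrete, here is a minimal sketch of one instruction-driven edit. It is not taken from the benchmark: the chart, the instruction, and the function names `draw_original` and `draw_edited` are all hypothetical, and matplotlib is assumed as the plotting library. The edited code should apply exactly what the instruction asks and leave everything else, such as the data, untouched.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

REGIONS = ["A", "B", "C"]
VALUES = [3, 5, 2]

def draw_original():
    """Hypothetical source chart: blue bars with the original title."""
    fig, ax = plt.subplots()
    ax.bar(REGIONS, VALUES, color="tab:blue")
    ax.set_title("Sales by region")
    return fig, ax

def draw_edited():
    """Code after applying the hypothetical instruction:
    "Make the bars red and rename the title to 'Revenue by region'."
    """
    fig, ax = plt.subplots()
    ax.bar(REGIONS, VALUES, color="red")
    ax.set_title("Revenue by region")
    return fig, ax

_, ax_before = draw_original()
_, ax_after = draw_edited()

# Properties the instruction asked to change.
bar_colors = [to_hex(p.get_facecolor()) for p in ax_after.patches]
new_title = ax_after.get_title()

# Properties the instruction did not mention should stay the same.
heights_unchanged = (
    [p.get_height() for p in ax_before.patches]
    == [p.get_height() for p in ax_after.patches]
)
```

Checking both the requested changes and the untouched properties mirrors the distinction the abstract draws between following the instruction and reproducing the overall chart. The benchmark's actual scoring may work differently.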
Similar Papers
EDIT-Bench: Evaluating LLM Abilities to Perform Real-World Instructed Code Edits
Software Engineering
Tests AI that fixes computer code from instructions.
From Charts to Code: A Hierarchical Benchmark for Multimodal Models
Software Engineering
Helps computers make charts from data.
ChartEditor: A Reinforcement Learning Framework for Robust Chart Editing
Multimedia
Helps computers fix charts from pictures.