VEHME: A Vision-Language Model For Evaluating Handwritten Mathematics Expressions
By: Thu Phuong Nguyen, Duc M. Nguyen, Hyotaek Jeon, and more
Potential Business Impact:
Helps computers grade handwritten math homework.
Automatically assessing handwritten mathematical solutions is an important problem in educational technology with practical applications, but it remains a significant challenge due to the diverse formats, unstructured layouts, and symbolic complexity of student work. To address this challenge, we introduce VEHME, a Vision-Language Model for Evaluating Handwritten Mathematics Expressions, designed to assess open-form handwritten math responses with high accuracy and interpretable reasoning traces. VEHME integrates a two-phase training pipeline: (i) supervised fine-tuning using structured reasoning data, and (ii) reinforcement learning that aligns model outputs with multi-dimensional grading objectives, including correctness, reasoning depth, and error localization. To enhance spatial understanding, we propose an Expression-Aware Visual Prompting Module, trained on our synthesized multi-line math expressions dataset to robustly guide attention in visually heterogeneous inputs. Evaluated on the AIHub and FERMAT datasets, VEHME achieves state-of-the-art performance among open-source models and approaches the accuracy of proprietary systems, demonstrating its potential as a scalable and accessible tool for automated math assessment. Our training and experiment code is publicly available at our GitHub repository.
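The reinforcement learning phase described in the abstract aligns model outputs with several grading objectives at once. The sketch below illustrates one plausible way such signals could be combined into a scalar reward; the field names, weights, and helper class are hypothetical assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class GradingSignal:
    """Hypothetical per-response grading signals (not the paper's exact schema)."""
    correctness: float       # 1.0 if the final answer is judged correct, else 0.0
    reasoning_depth: float   # fraction of required solution steps covered, in [0, 1]
    error_localized: float   # 1.0 if the flagged error location matches the reference, else 0.0


def combined_reward(signal: GradingSignal,
                    w_correct: float = 0.5,
                    w_depth: float = 0.3,
                    w_localize: float = 0.2) -> float:
    """Weighted sum of grading objectives; the weights are illustrative assumptions."""
    return (w_correct * signal.correctness
            + w_depth * signal.reasoning_depth
            + w_localize * signal.error_localized)


if __name__ == "__main__":
    # Example: correct final answer, most steps shown, error location matched.
    print(combined_reward(GradingSignal(1.0, 0.8, 1.0)))  # 0.94
```

A scalar of this form could serve as the per-sample reward in a standard policy-optimization loop; how the paper actually weights or structures its objectives is not specified here.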
Similar Papers
Mask & Match: Learning to Recognize Handwritten Math with Self-Supervised Attention
CV and Pattern Recognition
Lets computers understand handwritten math problems.
Link prediction Graph Neural Networks for structure recognition of Handwritten Mathematical Expressions
CV and Pattern Recognition
Lets computers understand handwritten math problems.
MathSight: A Benchmark Exploring Have Vision-Language Models Really Seen in University-Level Mathematical Reasoning?
CV and Pattern Recognition
Tests if computers *really* see math problems.