SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems

Published: March 13, 2025 | arXiv ID: 2503.10627v1

By: Ziyu Guo, Ray Zhang, Hao Chen and more

Potential Business Impact:

Tests if AI understands science like a student.

Business Areas:

Semantic Web, Internet Services

The rapid advancement of Large Multi-modal Models (LMMs) has enabled their application in scientific problem-solving, yet their fine-grained capabilities remain under-explored. In this paper, we introduce SciVerse, a multi-modal scientific evaluation benchmark to thoroughly assess LMMs across 5,735 test instances in five distinct versions. We aim to investigate three key dimensions of LMMs: scientific knowledge comprehension, multi-modal content interpretation, and Chain-of-Thought (CoT) reasoning. To unveil whether LMMs possess sufficient scientific expertise, we first transform each problem into three versions containing different levels of knowledge required for solving, i.e., Knowledge-free, -lite, and -rich. Then, to explore how LMMs interpret multi-modal scientific content, we annotate another two versions, i.e., Vision-rich and -only, moving more of the question information from the text into the diagrams. By comparing results across the versions, SciVerse systematically examines the professional knowledge and visual perception skills of LMMs in scientific domains. In addition, to rigorously assess CoT reasoning, we propose a new scientific CoT evaluation strategy that conducts a step-wise assessment of knowledge and logical errors in model outputs. Our extensive evaluation of different LMMs on SciVerse reveals critical limitations in their scientific proficiency and provides new insights into future developments. Project page: https://sciverse-cuhk.github.io

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

CV and Pattern Recognition

Tests AI's ability to understand science videos.

9 Oct 2025 1

90%

MMSciBench: Benchmarking Language Models on Chinese Multimodal Scientific Problems

Machine Learning (CS)

Tests if computers can solve science problems.

27 Feb 2025 3

89%

SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification

Computation and Language

Helps computers understand science papers better.

18 Jun 2025 1

Page Count

21 pages
