Benchmarking Multimodal Large Language Models for Face Recognition
By: Hatef Otroshi Shahreza, Sébastien Marcel
Potential Business Impact:
Tests how well AI models can recognize faces.
Multimodal large language models (MLLMs) have achieved remarkable performance across diverse vision-and-language tasks. However, their potential in face recognition remains underexplored. In particular, the performance of open-source MLLMs needs to be evaluated and compared with existing face recognition models on standard benchmarks under comparable protocols. In this work, we present a systematic benchmark of state-of-the-art MLLMs for face recognition on several face recognition datasets, including LFW, CALFW, CPLFW, CFP, AgeDB, and RFW. Experimental results reveal that while MLLMs capture rich semantic cues useful for face-related tasks, they lag behind specialized models in high-precision recognition scenarios in zero-shot settings. This benchmark provides a foundation for advancing MLLM-based face recognition, offering insights for the design of next-generation models with higher accuracy and generalization. The source code of our benchmark is publicly available on the project page.
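To illustrate the kind of zero-shot verification protocol the abstract describes, the sketch below scores LFW-style labeled image pairs with a generic MLLM chat interface and reports verification accuracy. The `query_mllm` wrapper and the pair-list format are assumptions for illustration only, not the authors' released benchmark code.

```python
# Minimal sketch of a zero-shot face-verification loop for an MLLM,
# in the spirit of the LFW pair protocol mentioned in the abstract.
# `query_mllm` is a HYPOTHETICAL wrapper around any open-source MLLM;
# it is not the authors' API.

from typing import Callable, List, Tuple

def verify_pairs(
    pairs: List[Tuple[str, str, bool]],          # (image_a, image_b, same_person)
    query_mllm: Callable[[str, str, str], str],  # (img_a, img_b, prompt) -> answer text
) -> float:
    """Return zero-shot verification accuracy over labeled image pairs."""
    prompt = (
        "Do these two photos show the same person? "
        "Answer with exactly one word: yes or no."
    )
    correct = 0
    for img_a, img_b, same_person in pairs:
        answer = query_mllm(img_a, img_b, prompt).strip().lower()
        predicted_same = answer.startswith("yes")
        correct += int(predicted_same == same_person)
    return correct / len(pairs)

# Example usage with a stub model that always answers "no":
if __name__ == "__main__":
    dummy_pairs = [("a1.jpg", "a2.jpg", True), ("a1.jpg", "b1.jpg", False)]
    acc = verify_pairs(dummy_pairs, lambda a, b, p: "no")
    print(f"verification accuracy: {acc:.2%}")
```

Specialized face recognition models instead compare embedding similarities against a tuned threshold, which is one reason they retain an edge in high-precision settings over this kind of free-form prompting.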
Similar Papers
CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model
Computation and Language
Helps computers understand financial charts and numbers.
Towards Fine-Grained Recognition with Large Visual Language Models: Benchmark and Optimization Strategies
CV and Pattern Recognition
Helps AI understand pictures and details better.
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
Computation and Language
Helps computers understand how people *really* talk.