Score: 2

GeoArena: An Open Platform for Benchmarking Large Vision-language Models on WorldWide Image Geolocalization

Published: September 4, 2025 | arXiv ID: 2509.04334v2

By: Pengyue Jia, Yingyi Zhang, Xiangyu Zhao and more

Potential Business Impact:

Lets computers guess where photos were taken.

Business Areas:

Geospatial Data and Analytics, Navigation and Mapping

Image geolocalization aims to predict the geographic location of images captured anywhere on Earth, but its global nature presents significant challenges. Current evaluation methodologies suffer from two major limitations. First, data leakage: advanced approaches often rely on large vision-language models (LVLMs) to predict image locations, yet these models are frequently pretrained on the test datasets, which undermines any assessment of a model's actual geolocalization capability. Second, existing metrics primarily rely on exact geographic coordinates to assess predictions, which not only neglects the reasoning process but also raises privacy concerns when user-level location data is required. To address these issues, we propose GeoArena, the first open platform for evaluating LVLMs on worldwide image geolocalization tasks, offering true in-the-wild and human-centered benchmarking. GeoArena enables users to upload in-the-wild images for a more diverse evaluation corpus, and it leverages pairwise human judgments to determine which model output better aligns with human expectations. Our platform has been deployed online for two months, during which we collected thousands of voting records. Based on this data, we conduct a detailed analysis and establish a leaderboard of different LVLMs on the image geolocalization task.
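To make the step from pairwise human judgments to a leaderboard concrete, here is a minimal sketch. The abstract does not say how GeoArena aggregates votes, so this uses sequential Elo updates, a common choice for arena-style platforms, purely as an illustrative assumption. The function name `elo_leaderboard`, the vote tuple layout, the model names, and the parameters `k` and `base` are all hypothetical.

```python
from collections import defaultdict


def elo_leaderboard(votes, k=32.0, base=1000.0):
    """Rank models from pairwise votes using sequential Elo updates.

    votes: iterable of (model_a, model_b, winner), winner in {"a", "b", "tie"}.
    This is an assumed aggregation scheme, not the paper's stated method.
    """
    ratings = defaultdict(lambda: base)
    outcome = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        # Expected score of model_a under the Elo logistic model.
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        score_a = outcome[winner]
        # Zero-sum update: what model_a gains, model_b loses.
        delta = k * (score_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    # Highest rating first.
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical votes; model names are placeholders, not real results.
votes = [
    ("model_x", "model_y", "a"),
    ("model_y", "model_z", "a"),
    ("model_x", "model_z", "a"),
    ("model_y", "model_x", "tie"),
]
leaderboard = elo_leaderboard(votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Sequential Elo depends on the order in which votes arrive. A batch fit such as a Bradley-Terry model would remove that dependence, at the cost of a slightly more involved implementation.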

GeoArena: An Open Platform for Benchmarking Large Vision-language Models on WorldWide Image Geolocalization

CV and Pattern Recognition

Helps computers guess where pictures were taken.

4 Sep 2025 1

90%

From Pixels to Places: A Systematic Benchmark for Evaluating Image Geolocalization Ability in Large Language Models

CV and Pattern Recognition

Tests AI on guessing photo locations

3 Aug 2025 2

89%

Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models

CV and Pattern Recognition

Helps computers find places from any picture.

17 Jun 2025 1


Country of Origin

🇺🇸 🇭🇰 United States, Hong Kong

Repos / Data Links

github.com

Page Count

16 pages
