GeoArena: An Open Platform for Benchmarking Large Vision-language Models on WorldWide Image Geolocalization
By: Pengyue Jia, Yingyi Zhang, Xiangyu Zhao and more
Potential Business Impact:
Lets computers guess where photos were taken.
Image geolocalization aims to predict the geographic location of images captured anywhere on Earth, but its global nature presents significant challenges. Current evaluation methodologies suffer from two major limitations. First, data leakage: advanced approaches often rely on large vision-language models (LVLMs) to predict image locations, yet these models are frequently pretrained on the test datasets, which undermines any assessment of a model's actual geolocalization capability. Second, existing metrics rely primarily on exact geographic coordinates to assess predictions, which not only neglects the reasoning process but also raises privacy concerns when user-level location data is required. To address these issues, we propose GeoArena, the first open platform for evaluating LVLMs on worldwide image geolocalization tasks, offering true in-the-wild and human-centered benchmarking. GeoArena lets users upload in-the-wild images, yielding a more diverse evaluation corpus, and it uses pairwise human judgments to determine which model output better aligns with human expectations. Our platform has been deployed online for two months, during which we collected thousands of voting records. Based on this data, we conduct a detailed analysis and establish a leaderboard of different LVLMs on the image geolocalization task.
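The abstract does not say how pairwise votes are turned into a leaderboard. As a minimal sketch, assuming an Elo-style rating update (a common choice for arena-style evaluation, not confirmed as the GeoArena method), votes could be aggregated as follows. The function name, the vote format, the model names, and the constants `k` and `base` are all hypothetical.

```python
from collections import defaultdict


def elo_leaderboard(votes, k=32.0, base=1000.0):
    """Rank models from pairwise votes with Elo-style updates (illustrative only).

    Each vote is (model_a, model_b, winner), where winner is "a", "b", or "tie".
    """
    ratings = defaultdict(lambda: base)
    outcome_for_a = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        # Expected score of model_a under the logistic Elo model.
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        score_a = outcome_for_a[winner]
        # Zero-sum update: whatever model_a gains, model_b loses.
        delta = k * (score_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    # Highest rating first.
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical votes between three placeholder models.
votes = [
    ("model_x", "model_y", "a"),
    ("model_y", "model_z", "a"),
    ("model_x", "model_z", "a"),
    ("model_x", "model_y", "tie"),
]
board = elo_leaderboard(votes)
for name, rating in board:
    print(f"{name}: {rating:.1f}")
```

Elo updates depend on the order in which votes arrive. An order-independent alternative is to fit a Bradley-Terry model to all votes at once. Either way, only the human preference between two model outputs is needed, not the exact coordinates of the photo.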
Similar Papers
From Pixels to Places: A Systematic Benchmark for Evaluating Image Geolocalization Ability in Large Language Models
CV and Pattern Recognition
Tests AI on guessing photo locations.
Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models
CV and Pattern Recognition
Helps computers find places from any picture.