GROKE: Vision-Free Navigation Instruction Evaluation via Graph Reasoning on OpenStreetMap

Published: January 12, 2026 | arXiv ID: 2601.07375v1

By: Farzad Shami, Subhrasankha Dey, Nico Van de Weghe and more

The evaluation of navigation instructions remains a persistent challenge in Vision-and-Language Navigation (VLN) research. Traditional reference-based metrics such as BLEU and ROUGE fail to capture the functional utility of spatial directives, specifically whether an instruction successfully guides a navigator to the intended destination. Although existing VLN agents could serve as evaluators, their reliance on high-fidelity visual simulators introduces licensing constraints and computational costs, and perception errors further confound linguistic quality assessment. This paper introduces GROKE (Graph-based Reasoning over OSM Knowledge for instruction Evaluation), a vision-free, training-free, hierarchical LLM-based framework for evaluating navigation instructions using OpenStreetMap data. Through systematic ablation studies, we demonstrate that structured JSON and textual formats for spatial information substantially outperform grid-based and visual graph representations. Our hierarchical architecture combines sub-instruction planning with topological graph navigation, reducing navigation error by 68.5% compared to heuristic and sampling baselines on the Map2Seq dataset. The agent's execution success, trajectory fidelity, and decision patterns serve as proxy metrics for functional navigability given OSM-visible landmarks and topology, establishing a scalable and interpretable evaluation paradigm without visual dependencies. Code and data are available at https://anonymous.4open.science/r/groke.
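To make the idea of a structured JSON representation of spatial information concrete, the following is a minimal sketch of how a local view of a street graph might be serialized for an LLM agent. It is not the GROKE implementation: the toy graph, the field names, the `relative_direction` and `observation_json` helpers, and the 90-degree direction bins are all assumptions made for illustration, and no real OpenStreetMap data or API is used.

```python
import json

# Hypothetical toy street graph (not real OSM data, not the GROKE schema).
# Each node lists nearby landmarks and its neighbors with compass bearings
# in degrees (0 = north, 90 = east).
GRAPH = {
    "n1": {"landmarks": ["bank"], "neighbors": {"n2": 90}},
    "n2": {"landmarks": [], "neighbors": {"n1": 270, "n3": 0}},
    "n3": {"landmarks": ["pharmacy"], "neighbors": {"n2": 180}},
}


def relative_direction(heading, bearing):
    """Map a bearing, relative to the current heading, to a coarse turn label."""
    delta = (bearing - heading) % 360
    if delta < 45 or delta >= 315:
        return "forward"
    if delta < 135:
        return "right"
    if delta < 225:
        return "back"
    return "left"


def observation_json(graph, node, heading):
    """Serialize the agent's local view at `node` as a JSON string."""
    options = []
    for neighbor, bearing in graph[node]["neighbors"].items():
        options.append({
            "to": neighbor,
            "bearing": bearing,
            "direction": relative_direction(heading, bearing),
            "landmarks": graph[neighbor]["landmarks"],
        })
    observation = {
        "node": node,
        "heading": heading,
        "landmarks": graph[node]["landmarks"],
        "options": options,
    }
    return json.dumps(observation, indent=2)


if __name__ == "__main__":
    # The agent has just walked east from n1 to n2, so its heading is 90.
    print(observation_json(GRAPH, "n2", 90))
```

A text-only agent could receive such an observation at each step, together with the current sub-instruction, and choose one of the listed options. Keeping explicit keys for directions and landmarks is one plausible reason a structured format could be easier for a model to parse than a grid or a rendered graph, though the abstract itself does not explain the cause of the difference.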

Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation

CV and Pattern Recognition

Helps robots follow directions better in new places.

14 Mar 2025 1

89%

GC-VLN: Instruction as Graph Constraints for Training-free Vision-and-Language Navigation

Robotics

Helps robots follow directions in new places.

12 Sep 2025 0

88%

Breaking Down and Building Up: Mixture of Skill-Based Vision-and-Language Navigation Agents

Artificial Intelligence

Helps robots follow directions in new places.

11 Aug 2025 2
