Unifying points of interest taxonomies: mapping OpenStreetMap tags to the Foursquare category system
By: Lilou Soulas, Lorenzo Lucchini, Maurizio Napolitano and more
Potential Business Impact:
Connects different map data for better city apps.
The heterogeneity of Point of Interest (POI) taxonomies is a persistent challenge for the integration of urban datasets and the development of location-based services. OpenStreetMap (OSM) adopts a flexible, community-driven tagging system, while Foursquare (FS) relies on a curated hierarchical structure. Here we present an openly available benchmark and mapping framework that aligns OSM tags with the FS taxonomy. This resource integrates the richness of community-driven OSM data with the hierarchical structure of FS, enabling reproducible and interoperable urban analytics. The dataset is complemented by an evaluation of embedding-based and LLM-based alignment strategies and a pipeline that supports scalable updates as OSM evolves. Together, these elements provide both a robust reference resource and a practical tool for the community. Our approach is structured around three components: the construction of a manually curated benchmark as a gold standard, the evaluation of pretrained text embedding models for semantic alignment between OSM tags and FS categories, and an LLM-based refinement stage that enhances robustness and adaptability. The proposed methodology provides a scalable and reproducible solution for taxonomy unification, with direct applications to urban analytics, mobility studies, and smart city services.
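The semantic alignment step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `embed` function here is a toy bag-of-words vector standing in for a pretrained text embedding model, the category labels in `FS_CATEGORIES` are hypothetical examples rather than entries from the actual FS taxonomy, and `best_match` is an illustrative name. The sketch only shows the general idea of ranking candidate categories by cosine similarity to an OSM tag.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a pretrained embedding: a bag-of-words count vector."""
    for sep in ("=", "_", ">"):
        text = text.replace(sep, " ")
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(osm_tag, fs_categories):
    """Return (category, score) for the most similar category, or (None, 0.0)."""
    tag_vec = embed(osm_tag)
    score, category = max(
        ((cosine(tag_vec, embed(c)), c) for c in fs_categories),
        key=lambda pair: pair[0],
    )
    return (category, score) if score > 0 else (None, 0.0)


# Hypothetical category paths, for illustration only.
FS_CATEGORIES = [
    "Dining and Drinking > Restaurant",
    "Dining and Drinking > Cafe",
    "Retail > Bookstore",
]

if __name__ == "__main__":
    for tag in ("amenity=cafe", "amenity=restaurant", "shop=books"):
        print(tag, "->", best_match(tag, FS_CATEGORIES))
```

In this toy version `shop=books` finds no match, because "books" and "Bookstore" share no token. Purely lexical matching misses pairs like this, which is the kind of gap that pretrained embeddings and an LLM-based refinement stage are meant to close.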
Similar Papers
World-POI: Global Point-of-Interest Data Enriched from Foursquare and OpenStreetMap as Tabular and Graph Data
Databases
Combines two maps to show real businesses better.
SPOT: Bridging Natural Language and Geospatial Search for Investigative Journalists
Information Retrieval
Lets reporters find places using normal words.
OsmT: Bridging OpenStreetMap Queries and Natural Language with Open-source Tag-aware Language Models
Computation and Language
Lets computers understand map questions easily.