WWW2023

Geographic Information Retrieval Using Wikipedia Articles

Amir Krause, Sara Cohen

被引用 6 次

摘要

Assigning semantically relevant, real-world locations to documents opens new possibilities to perform geographic information retrieval. We propose a novel approach to automatically determine the latitude-longitude coordinates of appropriate Wikipedia articles with high accuracy, leveraging both text and metadata in the corpus. By examining articles whose base-truth coordinates are known, we show that our method attains a substantial improvement over state of the art works. We subsequently demonstrate how our approach could yield two benefits: (1) detecting significant geolocation errors in Wikipedia; and (2) proposing approximated coordinates for hundreds of thousands of articles which are not traditionally considered to be locations (such as events, ideas or people), opening new possibilities for conceptual geographic retrievals over Wikipedia.