AI based signage classification for linguistic landscape studies
By: Yuqin Jiang, Song Jiang, Jacob Algrim and more
Potential Business Impact:
AI helps map city languages faster.
Linguistic Landscape (LL) research traditionally relies on manual photography and annotation of public signage to examine the distribution of languages in urban space. While such methods yield valuable findings, the process is time-consuming and difficult to scale to large study areas. This study explores the use of an AI-powered language detection method to automate LL analysis. Using Honolulu Chinatown as a case study, we constructed a georeferenced photo dataset of 1,449 images collected by researchers and applied AI for optical character recognition (OCR) and language classification. We also manually validated the results to check accuracy. The model achieved an overall accuracy of 79%. Five recurring types of mislabeling were identified: distortion, reflection, degraded surface, graffiti, and hallucination. The analysis also reveals that the AI model treats all regions of an image equally, detecting peripheral or background text that human interpreters typically ignore. Despite these limitations, the results demonstrate the potential of integrating AI-assisted workflows into LL research to reduce this time-consuming manual work. However, given these limitations and mislabels, we recognize that AI cannot be fully trusted in this process. This paper encourages a hybrid approach that combines AI automation with human validation for a more reliable and efficient workflow.
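The validation step described in the abstract, comparing AI labels against manual labels and reporting an overall accuracy, can be illustrated with a minimal Python sketch. This is not the authors' code: the `SignRecord` structure, the function names, and the four sample records are hypothetical and chosen only to show the arithmetic and how disagreements could be routed to a human reviewer in a hybrid workflow.

```python
from dataclasses import dataclass


@dataclass
class SignRecord:
    """One validated image (hypothetical structure, not from the paper)."""
    image_id: str
    ai_label: str      # language label produced by the AI step
    manual_label: str  # label assigned by a human validator


def overall_accuracy(records):
    """Share of records where the AI label matches the manual label."""
    if not records:
        raise ValueError("no records to evaluate")
    correct = sum(1 for r in records if r.ai_label == r.manual_label)
    return correct / len(records)


def needs_review(records):
    """Image IDs where AI and human disagree, for follow-up inspection."""
    return [r.image_id for r in records if r.ai_label != r.manual_label]


# Made-up sample data for illustration only.
sample = [
    SignRecord("img_001", "Chinese", "Chinese"),
    SignRecord("img_002", "English", "English"),
    SignRecord("img_003", "Japanese", "Chinese"),
    SignRecord("img_004", "English", "English"),
]

print(overall_accuracy(sample))  # 0.75
print(needs_review(sample))      # ['img_003']
```

In a sketch like this, the disagreement list is where the error types named in the abstract (distortion, reflection, degraded surface, graffiti, hallucination) would be assigned by a human reviewer.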
Similar Papers
Human + AI for Accelerating Ad Localization Evaluation
Artificial Intelligence
Makes ads look good in any language.
Bridging Psychometric and Content Development Practices with AI: A Community-Based Workflow for Augmenting Hawaiian Language Assessments
Human-Computer Interaction
Helps check school tests in Hawaiian language.
Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI
Computation and Language
Helps tell if writing is human or AI.