Nationality and Region Prediction from Names: A Comparative Study of Neural Models and Large Language Models

Published: January 13, 2026 | arXiv ID: 2601.08692v1

By: Keito Inoshita

Predicting nationality from personal names has practical value in marketing, demographic research, and genealogical studies. Conventional neural models learn statistical correspondences between names and nationalities from task-specific training data, which makes it hard for them to generalize to low-frequency nationalities and to distinguish similar nationalities within the same region. Large language models (LLMs) have the potential to address these challenges by leveraging world knowledge acquired during pre-training. In this study, we comprehensively compare neural models and LLMs on nationality prediction, evaluating six neural models and six LLM prompting strategies across three granularity levels (nationality, region, and continent), with frequency-based stratified analysis and error analysis. Results show that LLMs outperform neural models at all granularity levels, with the gap narrowing as granularity becomes coarser. Simple machine learning methods exhibit the highest frequency robustness, while pre-trained models and LLMs show degradation for low-frequency nationalities. Error analysis reveals that LLMs tend to make "near-miss" errors, predicting the correct region even when the nationality is incorrect, whereas neural models exhibit more cross-regional errors and a bias toward high-frequency classes. These findings indicate that LLM superiority stems from world knowledge, that model selection should consider the required granularity, and that evaluation should account for error quality beyond accuracy.
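
To make the evaluation idea concrete, here is a minimal Python sketch of scoring predictions at the three granularity levels and measuring the share of "near-miss" errors (wrong nationality, correct region). It is an illustration under stated assumptions, not the paper's code: the `HIERARCHY` mapping, the `evaluate` function, and the toy labels are all hypothetical, and the paper's actual region and continent definitions may differ.

```python
from collections import Counter

# Hypothetical nationality -> (region, continent) mapping, for illustration only.
# The paper's actual label hierarchy is not given in the abstract.
HIERARCHY = {
    "Japan": ("East Asia", "Asia"),
    "South Korea": ("East Asia", "Asia"),
    "India": ("South Asia", "Asia"),
    "France": ("Western Europe", "Europe"),
    "Germany": ("Western Europe", "Europe"),
}

LEVELS = ("nationality", "region", "continent")


def evaluate(gold, pred):
    """Return accuracy at each granularity plus the near-miss share of errors."""
    hits = Counter()
    errors = 0
    near_misses = 0
    for g, p in zip(gold, pred):
        g_region, g_continent = HIERARCHY[g]
        p_region, p_continent = HIERARCHY[p]
        hits["nationality"] += g == p
        hits["region"] += g_region == p_region
        hits["continent"] += g_continent == p_continent
        if g != p:
            errors += 1
            # Near miss: the nationality is wrong but the region is right.
            near_misses += g_region == p_region
    scores = {level: hits[level] / len(gold) for level in LEVELS}
    scores["near_miss_rate"] = near_misses / errors if errors else 0.0
    return scores


# Toy labels, invented for this example.
gold = ["Japan", "South Korea", "India", "France", "Germany"]
pred = ["Japan", "Japan", "France", "Germany", "Germany"]
result = evaluate(gold, pred)
print(result)
```

On this toy data, two of the three nationality errors stay within the correct region, so region-level accuracy is higher than nationality-level accuracy. That is the pattern the abstract describes when it says the gap between models narrows at coarser granularity.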

Obscured but Not Erased: Evaluating Nationality Bias in LLMs via Name-Based Bias Benchmarks

Computation and Language

AI models show unfair bias based on names.

22 Jul 2025 0

89%

Unequal Opportunities: Examining the Bias in Geographical Recommendations by Large Language Models

Computation and Language

Fixes computer suggestions to be fairer to all places.

16 Mar 2025 1

88%

Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions

Computation and Language

Computers don't understand people's backgrounds well.

28 Feb 2025 1
