
Evaluating OpenAI GPT Models for Translation of Endangered Uralic Languages: A Comparison of Reasoning and Non-Reasoning Architectures

Published: December 18, 2025 | arXiv ID: 2512.16287v1

By: Yehor Tereshchenko, Mika Hämäläinen, Svitlana Myroniuk

Potential Business Impact:

Supports preservation of endangered languages by identifying which AI translation models are more willing and better suited to handle them.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The evaluation of Large Language Models (LLMs) for translation tasks has primarily focused on high-resource languages, leaving a significant gap in understanding their performance on low-resource and endangered languages. This study presents a comprehensive comparison of OpenAI's GPT models, specifically examining the differences between reasoning and non-reasoning architectures for translating between Finnish and four low-resource Uralic languages: Komi-Zyrian, Moksha, Erzya, and Udmurt. Using a parallel corpus of literary texts, we evaluate model willingness to attempt translation through refusal rate analysis across different model architectures. Our findings reveal significant performance variations between reasoning and non-reasoning models, with reasoning models showing refusal rates 16 percentage points lower than their non-reasoning counterparts. The results provide valuable insights for researchers and practitioners working with Uralic languages and contribute to the broader understanding of reasoning model capabilities for endangered language preservation.
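To make the refusal-rate metric mentioned in the abstract concrete, here is a minimal Python sketch of how such a rate could be computed per model. The sample outputs and the keyword-based refusal detector are illustrative assumptions, not the paper's actual data or detection procedure.

```python
from collections import defaultdict

# Hypothetical sample of (model, output) pairs; the study evaluates OpenAI GPT
# outputs for Finnish <-> Komi-Zyrian, Moksha, Erzya, and Udmurt translation.
outputs = [
    ("gpt-reasoning", "Translation: ..."),
    ("gpt-reasoning", "I'm sorry, I cannot translate this language."),
    ("gpt-non-reasoning", "I cannot provide a translation for this language."),
    ("gpt-non-reasoning", "Translation: ..."),
]

# Naive keyword heuristic for flagging refusals (an assumption for this sketch,
# not the paper's refusal-detection method).
REFUSAL_MARKERS = ("cannot translate", "cannot provide", "unable to translate")

def is_refusal(text: str) -> bool:
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

# Tally refusals and totals per model, then report the refusal rate.
counts = defaultdict(lambda: [0, 0])  # model -> [refusal_count, total_count]
for model, text in outputs:
    counts[model][0] += is_refusal(text)
    counts[model][1] += 1

for model, (refusals, total) in counts.items():
    print(f"{model}: refusal rate = {100 * refusals / total:.1f}%")
```

Comparing these per-model rates (e.g., reasoning vs. non-reasoning architectures) is the kind of analysis that yields the reported 16-percentage-point gap.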

Country of Origin
🇫🇮 Finland

Page Count
9 pages

Category
Computer Science:
Computation and Language