Bridging Language Gaps in Open-Source Documentation with Large-Language-Model Translation

Published: August 4, 2025 | arXiv ID: 2508.02497v1

By: Elijah Kayode Adejumo, Brittany Johnson, Mariam Guizani

Potential Business Impact:

Helps translate computer code instructions for everyone.

While open source communities attract diverse contributors globally, few repositories provide essential documentation in languages other than English. Large language models (LLMs) have demonstrated remarkable capabilities in software engineering tasks and translations across domains. However, little is known about LLM capabilities in translating open-source technical documentation, which mixes natural language, code, URLs, and markdown formatting. To understand the need and potential for LLMs in technical documentation translation, we evaluated community translation activity and English-to-German translations of 50 README files using OpenAI's ChatGPT 4 and Anthropic's Claude. We found scarce translation activity, mostly in larger repositories and community-driven in nature. LLM performance comparison suggests they can provide accurate translations. However, analysis revealed fidelity challenges: both models struggled to preserve structural components (e.g., hyperlinks) and exhibited formatting inconsistencies. These findings highlight both promise and challenges of LLM-assisted documentation internationalization. As a first step toward translation-aware continuous integration pipelines, we introduce TRIFID, an early-stage translation fidelity scoring framework that automatically checks how well translations preserve code, links, and formatting. Our efforts provide a foundation for automated LLM-driven support for creating and maintaining open source documentation.
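The abstract says TRIFID checks how well a translation preserves code, links, and formatting, but it does not describe the scoring itself. As a rough illustration of what such a check could look like, here is a minimal sketch. It is not TRIFID's implementation: the function names, the regular expressions, and the four score categories are all assumptions made for this example. It treats every URL, fenced code block, inline code span, and heading level in the source README as something that should reappear unchanged in the translation.

```python
import re
from collections import Counter

# Hypothetical patterns for this sketch; real Markdown parsing is more involved.
URL_RE = re.compile(r"https?://[^\s)>\]]+")
FENCE_RE = re.compile(r"```.*?```", re.DOTALL)
INLINE_CODE_RE = re.compile(r"`[^`\n]+`")
HEADING_RE = re.compile(r"^(#{1,6})\s", re.MULTILINE)


def preserved_fraction(source_items, translated_items):
    """Share of source items that reappear verbatim in the translation.

    Items are compared as multisets, so a link that occurs twice in the
    source must occur twice in the translation to count fully.
    """
    src = Counter(source_items)
    if not src:
        return 1.0  # nothing to preserve, so nothing was lost
    dst = Counter(translated_items)
    kept = sum(min(count, dst[item]) for item, count in src.items())
    return kept / sum(src.values())


def extract(markdown):
    """Pull out the structural elements this sketch compares."""
    without_fences = FENCE_RE.sub("", markdown)
    return {
        "links": URL_RE.findall(markdown),
        "code_blocks": FENCE_RE.findall(markdown),
        # Inline code is searched only outside fenced blocks.
        "inline_code": INLINE_CODE_RE.findall(without_fences),
        # Heading levels ("#", "##", ...) stand in for document structure.
        "headings": HEADING_RE.findall(markdown),
    }


def fidelity_report(source_md, translated_md):
    """Return a per-category preservation score between 0.0 and 1.0."""
    src, dst = extract(source_md), extract(translated_md)
    return {key: preserved_fraction(src[key], dst[key]) for key in src}


if __name__ == "__main__":
    english = "# Setup\n\nSee [docs](https://example.com/docs) and run `make`.\n"
    german = "# Einrichtung\n\nSiehe [Doku](https://example.com/doku) und `make`.\n"
    print(fidelity_report(english, german))
```

In the usage example, the translated README rewrites the link target along with the link text, which is the kind of hyperlink damage the abstract reports, so the `links` score drops while the other categories stay intact. Comparing fenced blocks verbatim is a deliberate simplification: a translation that legitimately localizes code comments would be penalized by this sketch.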

From Code Foundation Models to Agents and Applications: A Comprehensive Survey and Practical Guide to Code Intelligence

Software Engineering

Helps computers write computer programs from words.

23 Nov 2025 2

91%

Country of Origin

🇺🇸 🇨🇦 Canada, United States

Page Count

6 pages
