Bootstrapping Code Translation with Weighted Multilanguage Exploration

Published: January 7, 2026 | arXiv ID: 2601.03512v1

By: Yuhan Wu, Huan Zhang, Wei Cheng and more

Potential Business Impact:

Translates computer code between languages automatically.

Business Areas:
Translation Service, Professional Services

Code translation across multiple programming languages is essential yet challenging due to two vital obstacles: scarcity of parallel data paired with executable test oracles, and optimization imbalance when handling diverse language pairs. We propose BootTrans, a bootstrapping method that resolves both obstacles. Its key idea is to leverage the functional invariance and cross-lingual portability of test suites, adapting abundant pivot-language unit tests to serve as universal verification oracles for multilingual RL training. Our method introduces a dual-pool architecture with seed and exploration pools to progressively expand training data via execution-guided experience collection. Furthermore, we design a language-aware weighting mechanism that dynamically prioritizes harder translation directions based on relative performance across sibling languages, mitigating optimization imbalance. Extensive experiments on the HumanEval-X and TransCoder-Test benchmarks demonstrate substantial improvements over baseline LLMs across all translation directions, with ablations validating the effectiveness of both bootstrapping and weighting components.
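The language-aware weighting idea can be illustrated with a small sketch. The abstract does not give the actual formula, so the following is only an assumption: each target language's execution pass rate is compared with the mean over its sibling languages, and a softmax over that gap assigns larger training weights to the directions that lag behind. The function name `direction_weights`, the `temperature` parameter, and the sample pass rates are all hypothetical and do not come from the paper.

```python
import math


def direction_weights(pass_rates, temperature=1.0):
    """Map each target language's pass rate to a training weight.

    Hypothetical scheme (not taken from the paper): a direction whose
    pass rate is below the sibling mean gets a weight above 1, and the
    weights are normalized so that they average to 1.
    """
    if not pass_rates:
        return {}
    mean_rate = sum(pass_rates.values()) / len(pass_rates)
    # The gap is positive for directions that lag behind their siblings.
    scores = {
        lang: math.exp((mean_rate - rate) / temperature)
        for lang, rate in pass_rates.items()
    }
    total = sum(scores.values())
    n = len(scores)
    return {lang: n * score / total for lang, score in scores.items()}


# Illustrative pass rates for translating one source language into three targets.
rates = {"java": 0.80, "cpp": 0.60, "go": 0.40}
weights = direction_weights(rates)
for lang, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{lang}: {weight:.3f}")
```

Under these assumptions the weakest direction (`go` here) receives the largest weight, and a lower `temperature` sharpens the contrast between easy and hard directions. In an RL setup such weights might scale the per-sample loss or the sampling probability of each direction; which of the two BootTrans uses is not stated in the abstract.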

Country of Origin
🇨🇳 China

Page Count
12 pages

Category
Computer Science:
Software Engineering