
Translate, then Detect: Leveraging Machine Translation for Cross-Lingual Toxicity Classification

Published: September 17, 2025 | arXiv ID: 2509.14493v1

By: Samuel J. Bell, Eduardo Sánchez, David Dale and more

BigTech Affiliations: Meta

Potential Business Impact:

Helps computers spot bad online talk in many languages.

Business Areas:

Translation Service, Professional Services

Multilingual toxicity detection remains a significant challenge due to the scarcity of training data and resources for many languages. While prior work has leveraged the translate-test paradigm to support cross-lingual transfer across a range of classification tasks, the utility of translation in supporting toxicity detection at scale remains unclear. In this work, we conduct a comprehensive comparison of translation-based and language-specific/multilingual classification pipelines. We find that translation-based pipelines consistently outperform out-of-distribution classifiers in 81.3% of cases (13 of 16 languages), with translation benefits strongly correlated with both the resource level of the target language and the quality of the machine translation (MT) system. Our analysis reveals that traditional classifiers outperform large language model (LLM) judges, with this advantage being particularly pronounced for low-resource languages, where translate-classify methods dominate translate-judge approaches in 6 out of 7 cases. We additionally show that MT-specific fine-tuning on LLMs yields lower refusal rates compared to standard instruction-tuned models, but it can negatively impact toxicity detection accuracy for low-resource languages. These findings offer actionable guidance for practitioners developing scalable multilingual content moderation systems.
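To make the translate-classify idea in the abstract concrete, here is a minimal sketch: translate each input into English, then score it with an English-only toxicity classifier. It is not the authors' implementation. The names `toy_translate`, `toy_english_classifier`, and `translate_then_classify`, the tiny glossary, the keyword list, and the 0.5 threshold are all hypothetical placeholders; in practice the two callables would wrap a real MT system and a trained classifier.

```python
from typing import Callable, Iterable, List


def toy_translate(text: str, source_lang: str) -> str:
    """Hypothetical MT stand-in: looks up a tiny hand-written glossary.

    Falls back to returning the input unchanged when there is no entry.
    """
    glossary = {
        ("es", "eres un idiota"): "you are an idiot",
        ("es", "buenos días"): "good morning",
    }
    return glossary.get((source_lang, text.lower()), text)


def toy_english_classifier(text: str) -> float:
    """Hypothetical English toxicity scorer: keyword match, returns 0.0 or 1.0."""
    toxic_terms = {"idiot", "stupid"}
    words = [w.strip(".,!?") for w in text.lower().split()]
    return 1.0 if any(w in toxic_terms for w in words) else 0.0


def translate_then_classify(
    texts: Iterable[str],
    source_lang: str,
    translate: Callable[[str, str], str],
    classify: Callable[[str], float],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> List[bool]:
    """Translate each text to English, then flag it if the score meets the threshold."""
    flags = []
    for text in texts:
        english = translate(text, source_lang)
        flags.append(classify(english) >= threshold)
    return flags


if __name__ == "__main__":
    samples = ["Eres un idiota", "Buenos días"]
    print(translate_then_classify(samples, "es", toy_translate, toy_english_classifier))
```

Passing the translator and classifier as callables keeps the pipeline shape independent of any particular model. Replacing `classify` with a function that prompts an LLM would give the translate-judge variant that the abstract compares against.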

ylmmcl at Multilingual Text Detoxification 2025: Lexicon-Guided Detoxification and Classifier-Gated Rewriting

Computation and Language

Cleans up bad words in many languages.

24 Jul 2025 2

89%

Automatic Machine Translation Detection Using a Surrogate Multilingual Translation Model

Computation and Language

Finds fake translations to make language apps better.

4 Nov 2025 2

88%

Rethinking Toxicity Evaluation in Large Language Models: A Multi-Label Perspective

Computation and Language

Finds harmful text in computer writing better.

16 Oct 2025 0


Country of Origin

🇺🇸 United States

Page Count

16 pages
