System Report for CCL25-Eval Task 10: Prompt-Driven Large Language Model Merge for Fine-Grained Chinese Hate Speech Detection

Published: December 10, 2025 | arXiv ID: 2512.09563v1

By: Binglin Wu, Jiaxiu Zou, Xianneng Li

Potential Business Impact:

Detects implicit, context-dependent hate speech on Chinese social media.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The proliferation of hate speech on Chinese social media poses urgent societal risks, yet traditional systems struggle to decode context-dependent rhetorical strategies and evolving slang. To bridge this gap, we propose a novel three-stage LLM-based framework: Prompt Engineering, Supervised Fine-tuning, and LLM Merging. First, context-aware prompts are designed to guide LLMs in extracting implicit hate patterns. Next, task-specific features are integrated during supervised fine-tuning to enhance domain adaptation. Finally, merging fine-tuned LLMs improves robustness against out-of-distribution cases. Evaluations on the STATE-ToxiCN benchmark validate the framework's effectiveness, demonstrating superior performance over baseline methods in detecting fine-grained hate speech.
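The abstract does not specify how the fine-tuned LLMs are merged in the third stage. As a minimal sketch only, the snippet below illustrates one common merging approach, uniform weighted parameter averaging of checkpoint state dicts; the function and checkpoint names are hypothetical and not taken from the paper.

```python
# Minimal sketch of an LLM merging stage via weighted parameter averaging.
# This is an assumed, illustrative technique; the paper's actual merge
# procedure is not described in the abstract.
from typing import Dict, List, Optional
import torch


def merge_state_dicts(state_dicts: List[Dict[str, torch.Tensor]],
                      weights: Optional[List[float]] = None) -> Dict[str, torch.Tensor]:
    """Merge several fine-tuned checkpoints by (weighted) averaging of parameters."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    assert abs(sum(weights) - 1.0) < 1e-6, "merge weights should sum to 1"
    merged: Dict[str, torch.Tensor] = {}
    for key in state_dicts[0]:
        # Average each parameter tensor across all checkpoints.
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged


# Hypothetical usage: average two checkpoints fine-tuned with different prompts or seeds.
# sd_a = torch.load("ckpt_prompt_a.pt")
# sd_b = torch.load("ckpt_prompt_b.pt")
# merged = merge_state_dicts([sd_a, sd_b])
# torch.save(merged, "merged_ckpt.pt")
```

Averaging assumes the checkpoints share an architecture and initialization; under that assumption, the merged model can smooth out checkpoint-specific errors and improve robustness on out-of-distribution cases, which is the motivation the abstract gives for the merging stage.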

Country of Origin
🇨🇳 China

Page Count
8 pages

Category
Computer Science:
Computation and Language