MoMoE: Mixture of Moderation Experts Framework for AI-Assisted Online Governance
By: Agam Goyal, Xianyang Zhan, Yilun Chen and more
Potential Business Impact:
Helps online sites remove bad posts better.
Large language models (LLMs) have shown great potential in flagging harmful content in online communities. Yet, existing approaches for moderation require a separate model for every community and are opaque in their decision-making, limiting real-world adoption. We introduce Mixture of Moderation Experts (MoMoE), a modular, cross-community framework that adds post-hoc explanations to scalable content moderation. MoMoE orchestrates four operators -- Allocate, Predict, Aggregate, Explain -- and is instantiated as seven community-specialized experts (MoMoE-Community) and five norm-violation experts (MoMoE-NormVio). On 30 unseen subreddits, the best variants obtain Micro-F1 scores of 0.72 and 0.67, respectively, matching or surpassing strong fine-tuned baselines while consistently producing concise and reliable explanations. Although community-specialized experts deliver the highest peak accuracy, norm-violation experts provide steadier performance across domains. These findings show that MoMoE yields scalable, transparent moderation without needing per-community fine-tuning. More broadly, they suggest that lightweight, explainable expert ensembles can guide future NLP and HCI research on trustworthy human-AI governance of online communities.
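To make the four-operator flow concrete, here is a minimal sketch of how Allocate, Predict, Aggregate, and Explain could be chained. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how each operator works, so the affinity scores, the top-k weighting, the weighted-average aggregation, the 0.5 threshold, the template-based explanation, and the keyword "experts" are all hypothetical stand-ins (in MoMoE the experts are fine-tuned language models).

```python
from typing import Callable, Dict, Tuple

# Hypothetical expert interface: maps a comment to P(norm violation).
Expert = Callable[[str], float]


def allocate(affinity: Dict[str, float], top_k: int = 2) -> Dict[str, float]:
    """Allocate: keep the top_k experts by (assumed) affinity to the target
    community and normalize their weights so they sum to 1."""
    chosen = sorted(affinity.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(score for _, score in chosen)
    return {name: score / total for name, score in chosen}


def predict(experts: Dict[str, Expert], weights: Dict[str, float],
            comment: str) -> Dict[str, float]:
    """Predict: query only the experts that received a weight."""
    return {name: experts[name](comment) for name in weights}


def aggregate(weights: Dict[str, float], probs: Dict[str, float],
              threshold: float = 0.5) -> Tuple[bool, float]:
    """Aggregate: weighted average of expert probabilities, then threshold.
    Both the averaging rule and the threshold are assumptions."""
    score = sum(weights[name] * probs[name] for name in weights)
    return score >= threshold, score


def explain(weights: Dict[str, float], probs: Dict[str, float],
            flagged: bool) -> str:
    """Explain: a template-based post-hoc note naming the expert that
    contributed most to the score (a stand-in for a generated explanation)."""
    top = max(weights, key=lambda name: weights[name] * probs[name])
    verdict = "remove" if flagged else "keep"
    return (f"{verdict}: strongest signal from expert '{top}' "
            f"(p={probs[top]:.2f}, weight={weights[top]:.2f})")


def moderate(comment: str, experts: Dict[str, Expert],
             affinity: Dict[str, float]) -> Tuple[bool, float, str]:
    """Run the four operators in order for a single comment."""
    weights = allocate(affinity)
    probs = predict(experts, weights, comment)
    flagged, score = aggregate(weights, probs)
    return flagged, score, explain(weights, probs, flagged)


# Toy keyword experts, purely illustrative.
toy_experts: Dict[str, Expert] = {
    "incivility": lambda c: 0.9 if "idiot" in c.lower() else 0.1,
    "spam": lambda c: 0.8 if "http" in c.lower() else 0.05,
    "off_topic": lambda c: 0.2,
}
# Hypothetical affinity of an unseen community to each expert.
toy_affinity = {"incivility": 0.6, "spam": 0.3, "off_topic": 0.1}

flagged, score, note = moderate("You are an idiot", toy_experts, toy_affinity)
print(flagged, round(score, 3), note)
```

Because the target community enters only through the allocation weights in this sketch, a new community needs a new affinity table rather than a newly trained model, which mirrors the abstract's claim of moderation without per-community fine-tuning.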
Similar Papers
MoMoE: A Mixture of Expert Agent Model for Financial Sentiment Analysis
Computational Engineering, Finance, and Science
Makes AI smarter by letting many AI parts work together.
Mixture of Experts in Large Language Models
Machine Learning (CS)
Makes smart computer programs learn faster and better.
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Machine Learning (CS)
Makes smart computer programs use less power.