Mixture of Small and Large Models for Chinese Spelling Check
By: Ziheng Qiao, Houquan Zhou, Zhenghua Li
Potential Business Impact:
Fixes Chinese spelling mistakes more accurately, without costly LLM fine-tuning.
In the era of large language models (LLMs), various LLM-based methods have been developed for the Chinese Spelling Check (CSC) task, yet their performance remains unsatisfactory. In contrast, fine-tuned BERT-based models, relying on high-quality in-domain data, show excellent performance but suffer from edit-pattern overfitting. This paper proposes a novel dynamic mixture approach that combines the probability distributions of small models and LLMs during the beam search decoding phase, balancing the precise corrections of small models with the fluency of LLMs. This approach also eliminates the need for fine-tuning LLMs, saving significant time and resources and easing domain adaptation. Comprehensive experiments demonstrate that our mixture approach significantly boosts error correction capabilities, achieving state-of-the-art results across multiple datasets. Our code is available at https://github.com/zhqiao-nlp/MSLLM.
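A minimal Python sketch of the core idea: at each beam-search step, interpolate the small model's next-token distribution with the LLM's before ranking candidates. The toy distributions, the fixed mixing weight `lam`, and the helper names (`mix_distributions`, `beam_step`) are illustrative assumptions, not the paper's implementation; in particular, the paper's mixture weight is dynamic rather than fixed.

```python
import math

# Hypothetical next-token distributions over a tiny vocabulary for one
# decoding step. In the paper's setting, P_SMALL would come from a
# fine-tuned BERT-style CSC model and P_LLM from a frozen LLM; the
# numbers here are made up for illustration.
P_SMALL = {"的": 0.70, "地": 0.20, "得": 0.10}
P_LLM = {"的": 0.40, "地": 0.35, "得": 0.25}


def mix_distributions(p_small, p_llm, lam=0.6):
    """Linearly interpolate two next-token distributions.

    lam weights the small model's precise corrections, (1 - lam) the
    LLM's fluency. A fixed lambda is an assumption made here; the
    paper's mixture weight is computed dynamically.
    """
    vocab = set(p_small) | set(p_llm)
    return {t: lam * p_small.get(t, 0.0) + (1 - lam) * p_llm.get(t, 0.0)
            for t in vocab}


def beam_step(beams, p_mix, beam_size=2):
    """Expand each beam with every candidate token and keep the top-k
    hypotheses by cumulative log-probability."""
    candidates = [
        (seq + [tok], score + math.log(p))
        for seq, score in beams
        for tok, p in p_mix.items() if p > 0.0
    ]
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]


if __name__ == "__main__":
    p_mix = mix_distributions(P_SMALL, P_LLM)
    beams = beam_step([([], 0.0)], p_mix)  # start from an empty hypothesis
    for seq, score in beams:
        print("".join(seq), f"logp={score:.3f}")
```

With these toy numbers, the small model's confident correction ("的") still ranks first after mixing, while the LLM's distribution smooths the gap to fluent alternatives, which is the trade-off the abstract describes.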
Similar Papers
CEC-Zero: Chinese Error Correction Solution Based on LLM
Computation and Language
Teaches computers to fix Chinese text errors on their own.
Unveiling the Impact of Multimodal Features on Chinese Spelling Correction: From Analysis to Design
Computation and Language
Fixes typing mistakes in Chinese text better.
A Training-free LLM-based Approach to General Chinese Character Error Correction
Computation and Language
Fixes all Chinese typing mistakes, even missing ones.