Adapting LLMs for Minimal-edit Grammatical Error Correction
By: Ryszard Staruch, Filip Graliński, Daniel Dzienisiewicz
Potential Business Impact:
Fixes English grammar with fewer changes.
Decoder-only large language models have shown superior performance in fluency-edit English Grammatical Error Correction (GEC), but their adaptation to minimal-edit English GEC is still underexplored. To improve their effectiveness in the minimal-edit approach, we explore error rate adaptation and propose a novel training schedule method. Our experiments set a new state-of-the-art result for a single-model system on the BEA-test set. We also detokenize the most common English GEC datasets to match the natural way of writing text. In the process, we find errors in them. Our experiments analyze whether training on detokenized datasets impacts the results, and measure the impact of using datasets whose erroneous examples have been corrected. To facilitate reproducibility, we have released the source code used to train our models.
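The detokenization the abstract mentions (turning space-separated GEC corpus tokens back into naturally written text) could be sketched roughly as below. The specific rules the authors apply are not stated here, so this is only an illustrative assumption using common English tokenization artifacts:

```python
import re

def detokenize(tokens):
    """Join a token list into natural text (illustrative sketch, not the
    authors' actual procedure). Handles clitics like "n't" and basic
    punctuation spacing."""
    text = " ".join(tokens)
    # Reattach clitics such as "n't", "'s", "'re" to the preceding word.
    text = re.sub(r" (n't|'s|'re|'ve|'ll|'d|'m)", r"\1", text)
    # Remove the space before closing punctuation.
    text = re.sub(r" ([.,!?;:)\]])", r"\1", text)
    # Remove the space after opening brackets.
    text = re.sub(r"([(\[]) ", r"\1", text)
    return text

print(detokenize(["I", "do", "n't", "know", ",", "but", "it", "'s", "fine", "."]))
# prints: I don't know, but it's fine.
```

Real corpora need more rules (quotes, hyphens, currency symbols), which is presumably where the errors the authors report were uncovered.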
Similar Papers
Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction
Computation and Language
Fixes writing mistakes by understanding why they are wrong.
"When Data is Scarce, Prompt Smarter"... Approaches to Grammatical Error Correction in Low-Resource Settings
Computation and Language
Fixes grammar mistakes in many languages.
KoGEC : Korean Grammatical Error Correction with Pre-trained Translation Models
Computation and Language
Fixes Korean writing mistakes better than big AI.