MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction

Published: December 16, 2025 | arXiv ID: 2512.14114v1

By: Rui-Yang Ju, KokSheik Wong, Yanlin Jin and more

Potential Business Impact:

Makes old, blurry documents clear for computers.

Business Areas:

Image Recognition Data and Analytics, Software

Document image enhancement and binarization are commonly performed prior to document analysis and recognition tasks for improving the efficiency and accuracy of optical character recognition (OCR) systems. This is because directly recognizing text in degraded documents, particularly in color images, often results in unsatisfactory recognition performance. To address these issues, existing methods train independent generative adversarial networks (GANs) for different color channels to remove shadows and noise, which, in turn, facilitates efficient text information extraction. However, deploying multiple GANs results in long training and inference times. To reduce both training and inference times of document image enhancement and binarization models, we propose MFE-GAN, an efficient GAN-based framework with multi-scale feature extraction (MFE), which incorporates Haar wavelet transformation (HWT) and normalization to process document images before feeding them into GANs for training. In addition, we present novel generators, discriminators, and loss functions to improve the model's performance, and we conduct ablation studies to demonstrate their effectiveness. Experimental results on the Benchmark, Nabuco, and CMATERdb datasets demonstrate that the proposed MFE-GAN significantly reduces the total training and inference times while maintaining comparable performance with respect to state-of-the-art (SOTA) methods. The implementation of this work is available at https://ruiyangju.github.io/MFE-GAN.
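The abstract describes applying a Haar wavelet transformation (HWT) and normalization to document images before they reach the GANs. As a rough illustration of that kind of preprocessing, the sketch below computes a single-level 2D Haar decomposition of a grayscale image and min-max normalizes the low-frequency subband. It is not the authors' implementation: the function names `haar_dwt2` and `min_max_normalize`, the orthonormal 1/2 scaling, the choice of a single decomposition level, and the [0, 255] target range are all assumptions made for this example, and plain nested lists are used only to keep it dependency-free.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of a grayscale image (nested lists).

    Hypothetical helper for illustration. Height and width must be even.
    Returns four half-resolution subbands: (approx, lr_diff, tb_diff, diag).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must both be even")

    approx, lr_diff, tb_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, l_row, t_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block: p q on the top row, s t on the bottom row.
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2.0)  # low-pass in both directions
            l_row.append((p - q + s - t) / 2.0)  # left column minus right column
            t_row.append((p + q - s - t) / 2.0)  # top row minus bottom row
            d_row.append((p - q - s + t) / 2.0)  # diagonal difference
        approx.append(a_row)
        lr_diff.append(l_row)
        tb_diff.append(t_row)
        diag.append(d_row)
    return approx, lr_diff, tb_diff, diag


def min_max_normalize(band, lo=0.0, hi=255.0):
    """Linearly rescale a subband to [lo, hi] (assumed target range)."""
    flat = [v for row in band for v in row]
    vmin, vmax = min(flat), max(flat)
    if vmax == vmin:
        # Constant band: map everything to the lower bound.
        return [[lo for _ in row] for row in band]
    scale = (hi - lo) / (vmax - vmin)
    return [[lo + (v - vmin) * scale for v in row] for row in band]


# Toy 4x4 "document" patch; pixel values are made up for the example.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 0, 255],
    [0, 255, 0, 255],
]
approx, lr_diff, tb_diff, diag = haar_dwt2(image)
approx_norm = min_max_normalize(approx)
```

The low-frequency subband has half the height and width of the input, so a network consuming it processes a quarter as many pixels. That is one plausible way such preprocessing could shorten training and inference, though the abstract does not say which subbands MFE-GAN actually uses.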

Multi-Scale Target-Aware Representation Learning for Fundus Image Enhancement

Image and Video Processing

Cleans up blurry eye pictures for better diagnosis.

3 May 2025 2

87%

Hierarchical Graph Feature Enhancement with Adaptive Frequency Modulation for Visual Recognition

CV and Pattern Recognition

Helps computers see objects better in pictures.

15 Aug 2025 0

87%

Filling the Gaps: A Multitask Hybrid Multiscale Generative Framework for Missing Modality in Remote Sensing Semantic Segmentation

CV and Pattern Recognition

Helps computers understand Earth pictures even when data is missing.

14 Sep 2025 2

Country of Origin

🇦🇺 Australia

Page Count

16 pages
