MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
By: Chaewon Kim, Seoyeon Lee, Jonghyuk Park
Document shadow removal is essential for enhancing the clarity of digitized documents. Preserving high-frequency details (e.g., text edges and lines) is critical in this process because shadows often obscure or distort fine structures. This paper proposes a matte vision transformer (MatteViT), a novel shadow removal framework that applies spatial and frequency-domain information to eliminate shadows while preserving fine-grained structural details. To effectively retain these details, we employ two preservation strategies. First, our method introduces a lightweight high-frequency amplification module (HFAM) that decomposes and adaptively amplifies high-frequency components. Second, we present a continuous luminance-based shadow matte, generated using a custom-built matte dataset and shadow matte generator, which provides precise spatial guidance from the earliest processing stage. These strategies enable the model to accurately identify fine-grained regions and restore them with high fidelity. Extensive experiments on public benchmarks (RDD and Kligler) demonstrate that MatteViT achieves state-of-the-art performance, providing a robust and practical solution for real-world document shadow removal. Furthermore, the proposed method better preserves text-level details in downstream tasks, such as optical character recognition, improving recognition performance over prior methods.
Similar Papers
Leveraging Contrast Information for Efficient Document Shadow Removal
CV and Pattern Recognition
Clears shadows from scanned pages automatically.
Edge-Enhanced Vision Transformer Framework for Accurate AI-Generated Image Detection
CV and Pattern Recognition
Finds fake pictures made by computers.
Hybrid CNN-ViT Framework for Motion-Blurred Scene Text Restoration
CV and Pattern Recognition
Clears blurry text in pictures for computers.