MaskAnyNet: Rethinking Masked Image Regions as Valuable Information in Supervised Learning

Published: November 16, 2025 | arXiv ID: 2511.12480v1

By: Jingshan Hong, Haigen Hu, Huihuang Zhang and more

Potential Business Impact:

Lets computers learn more from pictures by reusing the image regions that masking would normally throw away.

Business Areas:
Image Recognition, Data and Analytics, Software

In supervised learning, traditional image masking faces two key issues: (i) the discarded pixels are underutilized, so valuable contextual information is lost; (ii) masking may remove small but critical features, which is especially harmful in fine-grained tasks. In contrast, masked image modeling (MIM) has demonstrated that masked regions can be reconstructed from partial input, showing that even incomplete data retains strong contextual consistency with the original image. This highlights the potential of masked regions as a source of semantic diversity. Motivated by this, we revisit the image masking approach and propose to treat masked content as auxiliary knowledge rather than discarding it. Building on this idea, we propose MaskAnyNet, which combines masking with a relearning mechanism to exploit both visible and masked information. It can be easily extended to any model by adding a branch that jointly learns from the recomposed masked regions. This approach leverages the semantic diversity of the masked regions to enrich features and preserve fine-grained details. Experiments on CNN and Transformer backbones show consistent gains across multiple benchmarks, and further analysis confirms that the method improves semantic diversity through the reuse of masked content.
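The core idea above is that masking should partition an image rather than discard part of it: the visible patches feed the main branch, while the masked patches are recomposed into an auxiliary input for a second branch. The paper's exact pipeline is not reproduced here; the sketch below only illustrates that partitioning step with NumPy, and the function names (`patchify`, `split_masked`) and the patch size / mask ratio are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def patchify(img, p):
    # Flatten an (H, W, C) image into (num_patches, p*p*C) rows,
    # one row per non-overlapping p x p patch.
    H, W, C = img.shape
    patches = img.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, p * p * C)

def split_masked(img, p=4, mask_ratio=0.5, rng=None):
    """Split an image into visible and masked patch sets.

    Unlike MIM-style pipelines that drop the masked patches, BOTH sets
    are returned: the masked patches can be recomposed into an auxiliary
    input for a second branch (a hypothetical reading of the relearning
    mechanism described in the abstract).
    """
    rng = rng or np.random.default_rng(0)
    patches = patchify(img, p)
    n = patches.shape[0]
    idx = rng.permutation(n)           # random patch order
    n_mask = int(n * mask_ratio)       # how many patches to mask
    masked_idx, visible_idx = idx[:n_mask], idx[n_mask:]
    return patches[visible_idx], patches[masked_idx]

# Toy 8x8 RGB image -> four 4x4 patches, half masked, half visible.
img = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
visible, masked = split_masked(img, p=4, mask_ratio=0.5)
print(visible.shape, masked.shape)  # (2, 48) (2, 48)
```

In a full model, `visible` and `masked` would each be reassembled (or embedded directly) and passed through the backbone and the additional branch respectively, with their features fused before the classifier; no pixels are wasted.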

Country of Origin
🇨🇳 China

Repos / Data Links

Page Count
12 pages

Category
Computer Science:
CV and Pattern Recognition