
OTR: Synthesizing Overlay Text Dataset for Text Removal

Published: October 3, 2025 | arXiv ID: 2510.02787v1

By: Jan Zdenek, Wataru Shimoda, Kota Yamaguchi

Potential Business Impact:

Supports better training and evaluation of tools that remove text from images.

Business Areas:
Text Analytics, Data and Analytics, Software

Text removal is a crucial task in computer vision with applications such as privacy preservation, image editing, and media reuse. While existing research has primarily focused on scene text removal in natural images, limitations in current datasets hinder both out-of-domain generalization and accurate evaluation. In particular, widely used benchmarks such as SCUT-EnsText suffer from ground truth artifacts due to manual editing, overly simplistic text backgrounds, and evaluation metrics that do not capture the quality of generated results. To address these issues, we introduce an approach to synthesizing a text removal benchmark applicable to domains other than scene text. Our dataset features text rendered on complex backgrounds using object-aware placement and vision-language model-generated content, ensuring clean ground truth and challenging text removal scenarios. The dataset is available at https://huggingface.co/datasets/cyberagent/OTR.
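
The abstract's point about clean ground truth can be illustrated with a minimal sketch. It is not the authors' pipeline. It only shows why a synthetic overlay avoids manual-editing artifacts: when text is rendered onto a known background, that background is itself an exact text-free target. The names overlay_text, glyph_mask, and psnr are hypothetical, the 4x4 grayscale image is a toy stand-in for a real image, and PSNR is used only as a familiar pixel-level score, not as the metric the paper recommends.

```python
import math

def overlay_text(background, glyph_mask, text_value=255):
    """Return a copy of `background` with `text_value` painted where the mask is 1."""
    return [
        [text_value if m else px for px, m in zip(bg_row, mask_row)]
        for bg_row, mask_row in zip(background, glyph_mask)
    ]

def psnr(prediction, target, max_value=255.0):
    """Peak signal-to-noise ratio between two equally sized grayscale images."""
    squared_errors = [
        (p - t) ** 2
        for pred_row, target_row in zip(prediction, target)
        for p, t in zip(pred_row, target_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)

# Toy 4x4 grayscale background (assumption: stands in for a complex image).
background = [[10 * (r + c) for c in range(4)] for r in range(4)]

# Hypothetical glyph mask: 1 marks pixels covered by rendered text.
glyph_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

model_input = overlay_text(background, glyph_mask)  # image with overlay text
ground_truth = background                           # exact text-free target

print(psnr(model_input, ground_truth))   # finite: the text is still present
print(psnr(ground_truth, ground_truth))  # inf: a perfect removal matches the target
```

Because the target is the untouched background, any residual error in a removal result comes from the model, not from artifacts introduced while editing the ground truth by hand.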

Repos / Data Links
https://huggingface.co/datasets/cyberagent/OTR

Page Count
7 pages

Category
Computer Science: Computer Vision and Pattern Recognition