Synthetic Industrial Object Detection: GenAI vs. Feature-Based Methods

Published: November 28, 2025 | arXiv ID: 2511.23241v1

By: Jose Moises Araya-Martinez, Adrián Sanchis Reig, Gautham Mohan and more

BigTech Affiliations: Mercedes-Benz

Potential Business Impact:

Makes robots learn from fake pictures, not real ones.

Business Areas:
Image Recognition Data and Analytics, Software

Reducing the burden of data generation and annotation remains a major challenge for the cost-effective deployment of machine learning in industrial and robotics settings. While synthetic rendering is a promising solution, bridging the sim-to-real gap often requires expert intervention. In this work, we benchmark a range of domain randomization (DR) and domain adaptation (DA) techniques, including feature-based methods, generative AI (GenAI), and classical rendering approaches, for creating contextualized synthetic data without manual annotation. Our evaluation focuses on the effectiveness and efficiency of low-level and high-level feature alignment, as well as a controlled diffusion-based DA method guided by prompts generated from real-world contexts. We validate our methods on two datasets: a proprietary industrial dataset (automotive and logistics) and a public robotics dataset. Results show that if render-based data with enough variability is available as a seed, simpler feature-based methods, such as brightness-based and perceptual hashing filtering, outperform more complex GenAI-based approaches in both accuracy and resource efficiency. Perceptual hashing consistently achieves the highest performance, with mAP50 scores of 98% and 67% on the industrial and robotics datasets, respectively. Additionally, GenAI methods incur significant time overhead for data generation with no apparent improvement in sim-to-real mAP values compared to simpler methods. Our findings offer actionable insights for efficiently bridging the sim-to-real gap, enabling high real-world performance from models trained exclusively on synthetic data.
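
The abstract names brightness-based and perceptual hashing filtering but does not say how they are implemented. As a rough illustration only, the sketch below shows one way such filters could select rendered images that resemble a small set of real reference images. It assumes grayscale images as 2D NumPy arrays, uses a simple average-hash variant rather than whatever hash the authors used, and all function names, thresholds, and the tolerance parameter are hypothetical.

```python
import numpy as np


def average_hash(gray, hash_size=8):
    """Average-hash variant (assumed here, not taken from the paper).

    Returns a flat boolean array of hash_size * hash_size bits.
    The image must be at least hash_size pixels in each dimension.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    # Crop so the image splits evenly into hash_size x hash_size blocks.
    bh, bw = h // hash_size, w // hash_size
    cropped = gray[: bh * hash_size, : bw * hash_size]
    # Block-mean downsampling to a hash_size x hash_size thumbnail.
    small = cropped.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    # One bit per cell: is the cell brighter than the thumbnail mean?
    return (small > small.mean()).ravel()


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))


def phash_filter(synthetic, real, max_distance=10, hash_size=8):
    """Indices of synthetic images whose hash lies within max_distance
    bits of at least one real image hash (hypothetical selection rule)."""
    real_hashes = [average_hash(r, hash_size) for r in real]
    keep = []
    for i, img in enumerate(synthetic):
        hs = average_hash(img, hash_size)
        if min(hamming(hs, rh) for rh in real_hashes) <= max_distance:
            keep.append(i)
    return keep


def brightness_filter(synthetic, real, tolerance=0.0):
    """Indices of synthetic images whose mean brightness falls inside the
    range spanned by the real images, widened by tolerance (assumed rule)."""
    means = [float(np.mean(r)) for r in real]
    lo, hi = min(means) - tolerance, max(means) + tolerance
    return [i for i, img in enumerate(synthetic)
            if lo <= float(np.mean(img)) <= hi]


if __name__ == "__main__":
    # Toy data: one "real" image, dark on the left and bright on the right.
    real = np.zeros((32, 32))
    real[:, 16:] = 200.0
    similar = real + 5.0      # same layout, slightly brighter
    inverted = 200.0 - real   # mirrored layout
    print(phash_filter([similar, inverted], [real]))
    print(brightness_filter([similar, inverted], [real], tolerance=10.0))
```

Both filters only need a pass over pixel values, which is consistent with the abstract's point that feature-based methods are cheap relative to diffusion-based generation. The thresholds would have to be tuned per dataset.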

Country of Origin
🇩🇪 Germany

Page Count
6 pages

Category
Computer Science:
CV and Pattern Recognition