On the Effectiveness of Textual Prompting with Lightweight Fine-Tuning for SAM3 Remote Sensing Segmentation

Published: December 17, 2025 | arXiv ID: 2512.15564v1

By: Roni Blushtein-Livnon, Osher Rafaeli, David Ioffe and more

Potential Business Impact:

Lets computers understand satellite pictures with words.

Business Areas:

Semantic Search, Internet Services

Remote sensing (RS) image segmentation is constrained by the limited availability of annotated data and by a domain gap between overhead imagery and the natural images used to train foundation models. This motivates effective adaptation under limited supervision. The concept-driven SAM3 framework generates masks from textual prompts without requiring task-specific modifications, which may enable this adaptation. We evaluate SAM3 on RS imagery across four target types, comparing textual, geometric, and hybrid prompting strategies under zero-shot inference and under lightweight fine-tuning at increasing supervision scales. Results show that combining semantic and geometric cues yields the highest performance across targets and metrics. Text-only prompting exhibits the lowest performance, with marked score gaps for irregularly shaped targets, reflecting limited semantic alignment between SAM3 textual representations and the targets' overhead appearances. Nevertheless, textual prompting with light fine-tuning offers a practical performance-effort trade-off for geometrically regular and visually salient targets. Across targets, performance improves from zero-shot inference to fine-tuning, followed by diminishing returns as the supervision scale increases. In other words, a modest geometric annotation effort is sufficient for effective adaptation. A persistent gap between Precision and IoU further indicates that under-segmentation and boundary inaccuracies remain prevalent error patterns in RS tasks, particularly for irregular and less prevalent targets.
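The link between a Precision-IoU gap and under-segmentation can be illustrated with a toy example. The sketch below assumes the standard pixel-wise definitions (Precision = TP / (TP + FP), IoU = TP / (TP + FP + FN)); the abstract does not state the exact metric definitions used in the paper, and the function name `mask_metrics` and the 6x6 masks are hypothetical, chosen only for illustration.

```python
def mask_metrics(pred, truth):
    """Return (precision, recall, iou) for two binary masks given as nested lists of 0/1.

    Assumes standard pixel-wise definitions; not taken from the paper.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted target, is target
            elif p and not t:
                fp += 1      # predicted target, is background
            elif t and not p:
                fn += 1      # missed target pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, iou


# Hypothetical ground truth: a 4x4 target inside a 6x6 tile (16 pixels).
truth = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
# Hypothetical under-segmented prediction: only the 2x2 core of the target (4 pixels).
pred = [[1 if 2 <= r <= 3 and 2 <= c <= 3 else 0 for c in range(6)] for r in range(6)]

precision, recall, iou = mask_metrics(pred, truth)
print(precision, recall, iou)  # prints: 1.0 0.25 0.25
```

Every predicted pixel lies on the target, so Precision is perfect, yet three quarters of the target is missed and IoU stays low. A mask that is too small is penalized by IoU but not by Precision, which is why a persistent gap between the two is consistent with under-segmentation.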

Visual and Text Prompt Segmentation: A Novel Multi-Model Framework for Remote Sensing

Multimedia

Maps land from space better, even small things.

10 Mar 2025 1

90%

ReSAM: Refine, Requery, and Reinforce: Self-Prompting Point-Supervised Segmentation for Remote Sensing Images

CV and Pattern Recognition

Teaches computers to perfectly outline things in satellite pictures.

26 Nov 2025 0

89%

Beyond Pixels: A Training-Free, Text-to-Text Framework for Remote Sensing Image Retrieval

CV and Pattern Recognition

Find satellite pictures using words, no training needed.

11 Dec 2025 0

Country of Origin

🇮🇱 Israel

Page Count

5 pages
