STSeg-Complex Video Object Segmentation: The 1st Solution for 4th PVUW MOSE Challenge

Published: April 11, 2025 | arXiv ID: 2504.08306v1

By: Kehuan Song, Xinglin Xie, Kexin Zhang and more

Potential Business Impact:

Lets computers track moving things in videos better.

Business Areas:
Image Recognition Data and Analytics, Software

Segmenting video objects in complex scenarios is highly challenging, and the MOSE dataset has contributed significantly to progress in this field. This technical report details the STSeg solution proposed by the "imaplus" team. STSeg finetunes SAM2 and the unsupervised model TMO on the MOSE dataset, which gives it clear advantages in handling complex object motion and long video sequences. In the inference phase, an Adaptive Pseudo-labels Guided Model Refinement Pipeline selects an appropriate model for processing each video. With the finetuned models and this pipeline, STSeg achieved a J&F score of 87.26% on the test set of the MOSE Track of the 4th PVUW Challenge (2025), securing 1st place.
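The abstract does not spell out how the pipeline chooses a model for each video, so the following is only a minimal sketch of one plausible reading: score each candidate model's masks against pseudo-label masks using region similarity J (intersection over union, the region half of the J&F metric) and keep the best-scoring model. The function names, the model keys `model_a` and `model_b`, and the selection rule itself are assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Sequence

# A binary mask for one frame: rows of 0/1 values (illustrative representation).
Mask = Sequence[Sequence[int]]


def jaccard(pred: Mask, ref: Mask) -> float:
    """Region similarity J: intersection over union of two binary masks."""
    inter = 0
    union = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            inter += 1 if (p and r) else 0
            union += 1 if (p or r) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_jaccard(preds: List[Mask], refs: List[Mask]) -> float:
    """Average J over the frames of one video."""
    if len(preds) != len(refs):
        raise ValueError("prediction and reference must have the same number of frames")
    if not preds:
        return 0.0
    return sum(jaccard(p, r) for p, r in zip(preds, refs)) / len(preds)


def select_model(candidates: Dict[str, List[Mask]], pseudo_labels: List[Mask]) -> str:
    """Hypothetical rule: pick the model whose masks agree most with the pseudo-labels."""
    return max(candidates, key=lambda name: mean_jaccard(candidates[name], pseudo_labels))


# Toy one-frame video with 2x2 masks; the keys are placeholder model names.
pseudo = [[[0, 1], [1, 1]]]
candidates = {
    "model_a": [[[0, 1], [1, 1]]],
    "model_b": [[[1, 1], [0, 0]]],
}
best = select_model(candidates, pseudo)
```

In this toy case `model_a` reproduces the pseudo-label exactly and is selected. A real pipeline would also need a boundary term (the F in J&F) and a way to produce the pseudo-labels, both of which are outside the scope of this sketch.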

Page Count
7 pages

Category
Computer Science:
CV and Pattern Recognition