Score: 0

A Multimodal and Multi-centric Head and Neck Cancer Dataset for Segmentation, Diagnosis and Outcome Prediction

Published: August 30, 2025 | arXiv ID: 2509.00367v3

By: Numan Saeed , Salma Hassan , Shahad Hardan and more

Potential Business Impact:

Helps doctors find and treat head cancer.

Business Areas:
Image Recognition Data and Analytics, Software

We present a publicly available multimodal dataset for head and neck cancer research, comprising 1123 annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies from patients with histologically confirmed disease, acquired from 10 international medical centers. All studies contain co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity from a long-term, multi-institution retrospective collection. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following established guidelines. We provide anonymized NifTi files, expert-annotated segmentation masks, comprehensive clinical metadata, and radiotherapy dose distributions for a patient subset. The metadata include TNM staging, HPV status, demographics, long-term follow-up outcomes, survival times, censoring indicators, and treatment information. To demonstrate its utility, we benchmark three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, using state-of-the-art deep learning models like UNet, SegResNet, and multimodal prognostic frameworks.

Country of Origin
🇦🇪 United Arab Emirates

Page Count
10 pages

Category
Computer Science:
CV and Pattern Recognition