Data Description

The dataset can be found here. For more information, see this preprint article about the dataset.

The dataset consists of 155 primary and 155 metastatic melanoma regions of interest (ROIs). The ROIs were manually selected, are H&E stained, and were scanned at 40× magnification (0.22 µm/px); each measures 1024 × 1024 pixels. For every ROI, annotations of both tissue and nuclei are provided, along with a context ROI of 5120 × 5120 pixels centered on the ROI. The annotations were created by a medical expert and then checked and corrected by a dermatopathologist. All cases were digitized at a large melanoma referral center, although 97 cases were revisions or consultations from other treatment hospitals and general practitioners. Annotations are provided in GeoJSON format, so they can be visualized in the open-source pathology image viewer QuPath [23, 24].
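The geometry and annotation format described above can be sketched in a few lines of Python. This is a minimal illustration, not the official loading code: it assumes the ROI is exactly centered in its context ROI, and it assumes each GeoJSON feature stores its class under `properties.classification.name` (the layout QuPath typically exports). The class names in the sample data (`nuclei_tumor`, `nuclei_lymphocyte`) are hypothetical placeholders, as are the helper names.

```python
import json
from collections import Counter

ROI_SIZE_PX = 1024        # ROI edge length, from the description
CONTEXT_SIZE_PX = 5120    # context ROI edge length, from the description
MICRONS_PER_PIXEL = 0.22  # pixel spacing at 40x, from the description


def roi_bounds_in_context(roi=ROI_SIZE_PX, context=CONTEXT_SIZE_PX):
    """Return (start, stop) pixel indices of the ROI along one axis of the
    context ROI, assuming the ROI is exactly centered."""
    start = (context - roi) // 2
    return start, start + roi


def physical_size_um(size_px, spacing=MICRONS_PER_PIXEL):
    """Convert an edge length in pixels to micrometers."""
    return size_px * spacing


def count_by_class(geojson_obj):
    """Count annotation features per class name.

    Accepts either a FeatureCollection dict or a bare list of features.
    The 'classification' -> 'name' property path is an assumption.
    """
    if isinstance(geojson_obj, dict):
        features = geojson_obj.get("features", [])
    else:
        features = geojson_obj
    counts = Counter()
    for feature in features:
        props = feature.get("properties") or {}
        classification = props.get("classification") or {}
        counts[classification.get("name", "unclassified")] += 1
    return counts


# Tiny in-memory stand-in for an annotation file (hypothetical class names).
SAMPLE = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [4, 0], [4, 4], [0, 0]]]},
     "properties": {"classification": {"name": "nuclei_tumor"}}},
    {"type": "Feature",
     "geometry": {"type": "Polygon",
                  "coordinates": [[[5, 5], [9, 5], [9, 9], [5, 5]]]},
     "properties": {"classification": {"name": "nuclei_tumor"}}},
    {"type": "Feature",
     "geometry": {"type": "Polygon",
                  "coordinates": [[[1, 1], [2, 1], [2, 2], [1, 1]]]},
     "properties": {"classification": {"name": "nuclei_lymphocyte"}}}
  ]
}
""")

if __name__ == "__main__":
    print(roi_bounds_in_context())
    print(round(physical_size_um(ROI_SIZE_PX), 2), "um per ROI edge")
    print(dict(count_by_class(SAMPLE)))
```

To use this with a real annotation file, replace `SAMPLE` with the result of `json.load` on the opened file. If the files use a different property layout, only `count_by_class` needs to change.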

Figure 1. Distribution of nuclei in the datasets.

Training Dataset (206 Samples) A total of 103 primary and 103 metastatic melanoma ROIs have been made publicly available. Together, these ROIs contain 97,429 nuclei.

Preliminary Test Set (10 Samples) A preliminary test set of 5 primary and 5 metastatic melanoma ROIs will be provided for validating models during the initial phase of the challenge. These ROIs contain 4,860 nuclei.

Final Test Set (94 Samples) The remaining 47 primary and 47 metastatic melanoma ROIs are kept private as an independent final test set to be used in the PUMA challenge. This set contains 45,406 nuclei and will only be used for the final evaluation phase.
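As a quick consistency check, the split sizes quoted above can be tallied to confirm that they add up to the 155 primary and 155 metastatic ROIs stated in the dataset description. The sketch below only restates the numbers from the text; the dictionary keys and the `summarize` helper are illustrative names, not part of any challenge tooling.

```python
# ROI and nuclei counts per split, copied from the descriptions above.
SPLITS = {
    "training": {"primary": 103, "metastatic": 103, "nuclei": 97_429},
    "preliminary_test": {"primary": 5, "metastatic": 5, "nuclei": 4_860},
    "final_test": {"primary": 47, "metastatic": 47, "nuclei": 45_406},
}


def summarize(splits):
    """Return totals over all splits plus per-split ROI counts and shares."""
    totals = {"primary": 0, "metastatic": 0, "nuclei": 0}
    for counts in splits.values():
        for key in totals:
            totals[key] += counts[key]
    totals["rois"] = totals["primary"] + totals["metastatic"]

    per_split = {}
    for name, counts in splits.items():
        rois = counts["primary"] + counts["metastatic"]
        per_split[name] = {
            "rois": rois,
            "share_of_rois": rois / totals["rois"],
            "mean_nuclei_per_roi": counts["nuclei"] / rois,
        }
    return totals, per_split


if __name__ == "__main__":
    totals, per_split = summarize(SPLITS)
    print(totals)
    for name, row in per_split.items():
        print(f"{name}: {row['rois']} ROIs, "
              f"{row['share_of_rois']:.1%} of all ROIs, "
              f"{row['mean_nuclei_per_roi']:.0f} nuclei per ROI on average")
```

The totals come to 310 ROIs (155 primary, 155 metastatic) and 147,695 nuclei, which matches the overall dataset size given earlier.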

Figure 2. Distribution of tissue class area in the datasets.

Figure 3. Dataset example.