Abstract
BACKGROUND AND PURPOSE: Accurate glioma segmentation has the potential to enhance clinical decision-making and treatment planning. Uncertainty quantification methods, including conformal prediction (CP), can improve the reliability of segmentation models. CP quantifies uncertainty with statistical confidence guarantees. This study aimed to apply CP to glioma segmentation.
MATERIALS AND METHODS: We used the publicly available University of California San Francisco (UCSF) and University of Pennsylvania (UPenn) glioma data sets. The UCSF data set (495 cases) was split into training (70%), validation (10%), calibration (10%), and test (10%) sets, and the UPenn data set (147 cases) was divided into external calibration (30%) and external test (70%) sets. A UNet model was trained, and its optimal threshold was set to 0.5 using prediction normalization. To apply CP, we selected the conformal threshold on the basis of the nonconformity scores of the internal and external calibration sets and then applied CP to the corresponding internal and external test sets. Coverage, the proportion of true labels within the prediction sets, was reported for all test sets. We defined the uncertainty ratio (UR) and assessed its correlation with the Dice score coefficient (DSC) and the 95th percentile Hausdorff distance (HD95). Additionally, we categorized cases into certain and uncertain groups on the basis of UR and compared their DSC and HD95. We also evaluated the correlation between UR and the evaluation metrics (DSC and HD95) of the Brain Tumor Segmentation (BraTS) fusion model segmentation and compared these metrics between the certain and uncertain subgroups.
RESULTS: The base model achieved a DSC of 0.86 and 0.83 and an HD95 of 7.35 and 11.71 on the internal and external test sets, respectively. The CP coverage was 0.9982 for the internal test set and 0.9977 for the external test set. Statistical analysis showed significant correlations between UR and the evaluation metrics for test sets (P value < .001). Additionally, certain cases had significantly better evaluation metrics (higher DSC and lower HD95) than uncertain cases in the test sets and the BraTS fusion model segmentation (P value < .001).
CONCLUSIONS: CP effectively quantifies uncertainty in glioma segmentation. Using conformal segmentation (CONSeg) improves the reliability of segmentation models and enhances human-computer interactions. Additionally, CONSeg can identify uncertain cases and suggest them for manual segmentation.
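As an illustration of the calibrate-then-apply procedure summarized in the methods, the following is a minimal sketch of split conformal prediction for voxel-wise binary segmentation. It is not the authors' implementation. The nonconformity score (1 minus the predicted probability of the true label), the miscoverage level `alpha`, the synthetic probabilities, and every function name are assumptions made for illustration. The `uncertain_fraction` value (share of voxels whose prediction set is not a single label) is only a hypothetical stand-in, because the abstract does not give the definition of UR.

```python
import math
import random


def nonconformity(p, y):
    """Assumed score: 1 - predicted probability of label y (p = P(tumor))."""
    return 1.0 - p if y == 1 else p


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Finite-sample corrected (1 - alpha) quantile of calibration scores."""
    scores = sorted(nonconformity(p, y) for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    # Too few calibration points for this alpha: every label must be kept.
    return math.inf if k > n else scores[k - 1]


def prediction_set(p, q):
    """Labels whose nonconformity score does not exceed the threshold q."""
    return {y for y in (0, 1) if nonconformity(p, y) <= q}


def evaluate(test_probs, test_labels, q):
    """Return (coverage, fraction of voxels without a single-label set)."""
    sets = [prediction_set(p, q) for p in test_probs]
    coverage = sum(y in s for s, y in zip(sets, test_labels)) / len(sets)
    uncertain_fraction = sum(len(s) != 1 for s in sets) / len(sets)
    return coverage, uncertain_fraction


def synthetic_voxels(n, rng):
    """Hypothetical voxel probabilities: noisy but informative predictions."""
    labels = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
    probs = [
        min(1.0, max(0.0, rng.gauss(0.8 if y == 1 else 0.2, 0.2)))
        for y in labels
    ]
    return probs, labels


rng = random.Random(0)
cal_probs, cal_labels = synthetic_voxels(2000, rng)     # calibration split
test_probs, test_labels = synthetic_voxels(2000, rng)   # held-out test split

q = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
coverage, uncertain_fraction = evaluate(test_probs, test_labels, q)
print(f"threshold={q:.3f} coverage={coverage:.3f} "
      f"uncertain_fraction={uncertain_fraction:.3f}")
```

With exchangeable calibration and test voxels, the empirical coverage should land near or above `1 - alpha`. In this sketch, cases with a high `uncertain_fraction` would be the ones flagged for manual review.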
ABBREVIATIONS:
- BCE: binary cross-entropy
- BFMS: BraTS fusion model segmentation
- BMOT: base model optimal threshold
- BMPN: base model prediction normalization
- BraTS: Brain Tumor Segmentation
- CONSeg: conformal segmentation
- CP: conformal prediction
- DL: deep learning
- DSC: Dice score coefficient
- HD95: Hausdorff distance 95th percentile
- NCST: nonconformity score threshold
- UCSF: University of California, San Francisco
- UPenn: University of Pennsylvania
- UQ: uncertainty quantification
- UR: uncertainty ratio
© 2025 by American Journal of Neuroradiology