Elsevier

Academic Radiology

Volume 5, Issue 9, September 1998, Pages 591-602

Original investigation
Monte Carlo validation of a multireader method for receiver operating characteristic discrete rating data: Factorial experimental design

https://doi.org/10.1016/S1076-6332(98)80294-8

Rationale and Objectives.

The authors conducted a series of null-case Monte Carlo simulations to evaluate the Dorfman-Berbaum-Metz (DBM) method for comparing modalities with multireader receiver operating characteristic (ROC) discrete rating data.

Materials and Methods.

Monte Carlo simulations were performed by using discrete ratings on fully crossed factorial designs with two modalities and three, five, and 10 hypothetical readers. The null hypothesis was true for all simulations. The population ROC areas, latent variable structures, case sample sizes, and normal/abnormal case sample ratios used in another study were used in these simulations.
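As an illustration of how null-case discrete rating data of this kind can be generated, the following minimal sketch draws ratings for a single reader and a single modality. It rests on several assumptions that are not stated in the abstract: an equal-variance binormal latent variable (for which the area equals Φ(μ/√2), so μ = √2·Φ⁻¹(Az)), hypothetical category cut points, and a trapezoidal (Wilcoxon) area in place of a fitted binormal Az. Reader and case variance components are omitted entirely, so this is not the simulation model used in the study.

```python
import random
from statistics import NormalDist


def separation_for_auc(az):
    """Mean separation that yields area `az` under an equal-variance binormal model."""
    return 2 ** 0.5 * NormalDist().inv_cdf(az)


def to_rating(x, cutpoints):
    """Map a latent value to an ordinal rating in 1..len(cutpoints)+1."""
    return 1 + sum(x > c for c in cutpoints)


def simulate_ratings(n_normal, n_abnormal, mu, cutpoints, rng):
    """Draw discrete ratings for normal (mean 0) and abnormal (mean mu) cases."""
    normal = [to_rating(rng.gauss(0.0, 1.0), cutpoints) for _ in range(n_normal)]
    abnormal = [to_rating(rng.gauss(mu, 1.0), cutpoints) for _ in range(n_abnormal)]
    return normal, abnormal


def trapezoid_auc(normal, abnormal):
    """Trapezoidal area: P(abnormal > normal) + 0.5 * P(tie)."""
    score = 0.0
    for a in abnormal:
        for n in normal:
            score += 1.0 if a > n else 0.5 if a == n else 0.0
    return score / (len(normal) * len(abnormal))


rng = random.Random(1998)                 # fixed seed for reproducibility
mu = separation_for_auc(0.855)            # moderate ROC area from the abstract
cutpoints = [0.25, 0.75, 1.25, 1.75]      # hypothetical cut points (assumption)
normal, abnormal = simulate_ratings(50, 50, mu, cutpoints, rng)
auc = trapezoid_auc(normal, abnormal)
print(f"mu = {mu:.3f}, trapezoidal area = {auc:.3f}")
```

Under the null hypothesis, both modalities would be simulated with the same `mu`, so any observed difference in area reflects sampling variation only.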

Results.

For equal allocation ratios and small (Az = 0.702) and moderate (Az = 0.855) ROC areas, the empirical type I error rate closely matched the nominal α level. For very large ROC areas (Az = 0.961), however, the empirical type I error rate was somewhat smaller than the nominal α level. This conservatism increased with decreasing case sample size and with asymmetric normal/abnormal case allocation ratios. The empirical type I error rate was sometimes slightly larger than the nominal α level with many cases and few readers, where there was large residual, relatively small treatment-by-case interaction, and relatively large treatment-by-reader interaction.
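For readers unfamiliar with the term, an empirical type I error rate is the fraction of null-case simulations in which the test rejects at the nominal α level. The sketch below, which is illustrative only and uses made-up P values rather than results from the study, computes that fraction together with the binomial standard error expected if the true rejection rate equaled α. The function name `type1_summary` is hypothetical.

```python
from math import sqrt


def type1_summary(p_values, alpha=0.05):
    """Return (empirical rejection rate, SE under nominal alpha, z score)."""
    n = len(p_values)
    rate = sum(p < alpha for p in p_values) / n
    # Binomial standard error if the true rejection rate were exactly alpha.
    se = sqrt(alpha * (1 - alpha) / n)
    z = (rate - alpha) / se
    return rate, se, z


# Made-up P values standing in for 1,000 null-case simulation results.
p_values = [0.01] * 40 + [0.50] * 960
rate, se, z = type1_summary(p_values)
print(f"empirical rate = {rate:.3f}, SE = {se:.4f}, z = {z:.2f}")
```

A rate below α with a large negative z score would indicate a conservative test; a rate above α with a large positive z score would indicate a liberal one.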

Conclusion.

The results suggest that the DBM method provides trustworthy α levels with discrete ratings when the ROC area is not too large and case and reader sample sizes are not too small. In other situations, the test tends to be somewhat conservative or slightly liberal.


Supported in part by National Institutes of Health grants R01 CA 62362 (D.D.D., K.S.B., R.V.L., Y.F.C., B.D.) and R01 CA 42453 (D.D.D. and K.S.B.) and in part by U.S. Army Medical Research and Materiel Command grant DAMD17-96-1-6254 (D.D.D., K.S.B., R.V.L.).
