Effects of interrater reliability of psychopathologic assessment on power and sample size calculations in clinical trials

Matthias J Müller; Armin Szegedi

doi:10.1097/00004714-200206000-00013

J Clin Psychopharmacol. 2002 Jun;22(3):318-25. doi: 10.1097/00004714-200206000-00013.

Authors

Matthias J Müller¹, Armin Szegedi

Affiliation

¹ Department of Psychiatry, University of Mainz, Germany. mjm@mail.psychiatrie.klinik.uni-mainz.de

PMID: 12006903

Abstract

Although rater training is increasingly used to improve the quality of the investigated outcome parameters, the reliability of assessments is not perfect. Thus, empirical reliability estimates should be used instead of theoretically assumed perfect reliability. Implications of the reliability of psychiatric assessments for sample size and power calculations in clinical trials are presented. The theoretical basis of sample size and power calculations using empirical reliability scores is delineated. Examples from contemporary research on schizophrenia and depression are used to illustrate several implications for study design and interpretation of results. The tremendous impact of the lack of reliability of psychopathologic assessments on sample size, power, and detectable true score differences in clinical trials is shown. The problem of multiple outcome variables with different reliabilities is addressed. Studies lacking power because of unreliable assessments carry the risk of false-negative findings and raise ethical questions. Rater training is strongly recommended to assess and improve interrater reliability whenever necessary and possible before trials are started. Sample size calculations and power analysis should be based on empirical reliability values of outcome parameters as part of quality assurance and cost savings.
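
The dependence of sample size on reliability can be illustrated with a minimal sketch. It is not the authors' own derivation: it assumes the standard classical test theory attenuation relation (observed standardized difference = true standardized difference × √reliability) and the usual normal-approximation formula for a two-sided, two-group comparison of means with equal group sizes. The function names, the true effect size of 0.5, and the reliability values are hypothetical choices made for illustration only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def attenuated_effect_size(true_d: float, reliability: float) -> float:
    """Observed standardized difference under classical test theory.

    Assumption: true_d is expressed in true-score SD units, and measurement
    error inflates the observed variance by a factor of 1 / reliability.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return true_d * sqrt(reliability)


def n_per_group(true_d: float, reliability: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided two-sample z-test.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 / d_obs^2, where d_obs is
    the attenuated (observed) effect size.
    """
    z = NormalDist().inv_cdf
    d_obs = attenuated_effect_size(true_d, reliability)
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / d_obs ** 2)


if __name__ == "__main__":
    # Hypothetical scenario: true effect of 0.5 SD, varying reliability.
    for rel in (1.0, 0.9, 0.7, 0.5):
        print(f"reliability={rel:.1f}  n per group={n_per_group(0.5, rel)}")
```

Under these assumptions the required sample size scales with 1 / reliability, so an outcome measure with a reliability of 0.5 needs about twice as many patients per group as a perfectly reliable one to detect the same true difference with the same power.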

MeSH terms

Clinical Trials as Topic / methods
Clinical Trials as Topic / statistics & numerical data*
Humans
Observer Variation*
Psychopathology
Sample Size