Biased odds ratios from dichotomization of age

Stat Med. 2007 Aug 15;26(18):3487-97. doi: 10.1002/sim.2737.

Abstract

Dichotomizing a continuous variable is known to result in loss of information, lower statistical power, and lower reliability. In many epidemiological studies, age is a scaled (continuous) variable prior to statistical analysis; however, despite pleas from methodologists, researchers frequently dichotomize age in their data analyses without an appropriate rationale. Using simulated case-control data, we show that dichotomizing age will generally lead to a biased odds ratio (OR). When age was a confounder (potentially representing common causes of risks and outcomes), including age as a scaled variable provided satisfactory control whether the age effect was linear or non-linear in the logit, whereas when age was categorized, the estimated risk factor effect was biased. We also demonstrate that the further the cutpoint is from the median age, the greater the increase in the OR; thus, in cases where age dichotomization is warranted, researchers are cautioned not to allow the size of the empirical OR to influence their choice of cutpoint. Recommendations are made for analysing age in epidemiological data and for interpreting empirical findings.
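
The residual-confounding mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design: it draws a simple cohort-style sample rather than case-control data, and every parameter value (uniform ages 20 to 70, a true exposure OR of 2, the age coefficients, the cutpoint at the median age of 45) is an assumption chosen for illustration. It fits a logistic model by Newton-Raphson using only the Python standard library, once with age as a continuous covariate and once with age dichotomized.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_logistic(X, y, iters=10):
    """Maximum-likelihood logistic regression via Newton-Raphson.
    Each row of X must already include the intercept column."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            mu = expit(sum(b * x for b, x in zip(beta, row)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += w * row[j] * row[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def simulate(seed=0, n=10000, true_or=2.0, cutpoint=45.0):
    """Return (OR adjusting for continuous age, OR adjusting for dichotomized age).
    All parameter values are illustrative assumptions, not taken from the paper."""
    rng = random.Random(seed)
    X_cont, X_dich, y = [], [], []
    for _ in range(n):
        age = rng.uniform(20.0, 70.0)
        z = (age - 45.0) / 10.0                      # centred age, per decade
        exposed = 1.0 if rng.random() < expit(z) else 0.0   # age -> exposure
        logit = -1.0 + math.log(true_or) * exposed + z      # age -> outcome
        y.append(1.0 if rng.random() < expit(logit) else 0.0)
        X_cont.append([1.0, exposed, z])
        X_dich.append([1.0, exposed, 1.0 if age >= cutpoint else 0.0])
    b_cont = fit_logistic(X_cont, y)
    b_dich = fit_logistic(X_dich, y)
    return math.exp(b_cont[1]), math.exp(b_dich[1])

or_cont, or_dich = simulate()
print(f"true OR = 2.00, continuous age: {or_cont:.2f}, dichotomized age: {or_dich:.2f}")
```

Under these assumed parameters, the estimate adjusted for continuous age should land near the true OR of 2, while the estimate adjusted for dichotomized age should sit noticeably higher, because age still varies within each of the two age groups and continues to confound the exposure effect there.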

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Age Factors
  • Bias*
  • Confounding Factors, Epidemiologic
  • Data Interpretation, Statistical*
  • Epidemiologic Studies*
  • Humans
  • Middle Aged
  • Odds Ratio*
  • Reproducibility of Results