Classification versus association models: should the same methods apply?
Abstract
Association and classification models differ fundamentally in their objectives, their measures, and their specificity to a clinical context. Association studies aim to identify associations between biomarkers and disease in a study population and to provide etiologic insights. Common measures of association are the odds ratio, hazard ratio, and correlation coefficient. Classification studies aim to evaluate how well a biomarker aids specific clinical decisions for individual patients. Common measures of classification performance are sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Good association is usually a necessary, but not a sufficient, condition for good classification. Methods for developing classification models have mainly used the criteria for association models, usually minimizing total classification error without considering the clinical application setting, and are therefore not optimal for classification purposes. We suggest that developing classification models by focusing on the region of the receiver operating characteristic (ROC) curve relevant to the intended clinical application optimizes the model for that setting.
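As an illustration of the classification measures named above, and of what restricting attention to one region of the ROC curve can look like in practice, the following is a minimal sketch rather than the authors' method. It assumes a single continuous biomarker score where higher values indicate disease, and it summarizes the clinically relevant region by the partial area under the empirical ROC curve up to a chosen false positive rate. That summary is one possible choice made here for illustration. The function names, the toy data, the threshold of 0.5, and the cap `max_fpr = 0.2` are all hypothetical.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN when score >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        called_positive = score >= threshold
        if called_positive and label == 1:
            tp += 1
        elif called_positive:
            fp += 1
        elif label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_measures(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from the four cell counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # PPV and NPV are undefined when nobody is called positive/negative.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
    }


def roc_points(scores, labels):
    """Empirical ROC points (FPR, TPR), ordered from (0, 0) to (1, 1)."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def partial_auc(points, max_fpr=1.0):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the segment up to the FPR cap.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: 5 diseased (1) and 5 non-diseased (0) subjects.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

measures = classification_measures(*confusion_counts(scores, labels, 0.5))
curve = roc_points(scores, labels)
full_auc = partial_auc(curve, max_fpr=1.0)
low_fpr_auc = partial_auc(curve, max_fpr=0.2)  # high-specificity region only
```

In this sketch, `full_auc` weighs every part of the ROC curve equally, whereas `low_fpr_auc` counts only the region where specificity is at least 0.8. A setting in which false positives are costly might favor the latter, while a setting that demands high sensitivity would cap the false negative rate instead.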