Table 2. Risk Score Sensitivity, Specificity, and Overall Classification Accuracy at Select Cutoff Points for Predicting Extended-Spectrum β-Lactamase (ESBL) Status in a Cohort of Adult Patients with Escherichia coli and Klebsiella Species Bacteremia^a
Risk Score Cutoff Point   Sensitivity, %   Specificity, %   Observations Correctly Classified, %
≥0                        100.0              0.7            15.7
≥0.25                      95.4             31.5            41.2
≥0.5                       94.9             35.7            44.6
≥0.75                      93.8             37.0            45.6
≥1                         93.3             38.8            47.0
≥1.25                      90.7             51.3            57.2
≥1.5                       89.7             54.1            59.5
≥1.75                      89.2             55.6            60.6
≥2                         88.7             56.7            61.5
≥2.25                      85.6             70.2            72.5
≥2.5                       84.0             71.6            73.5
≥2.75                      83.5             72.5            74.2
≥3                         83.5             73.1            74.7
≥3.25                      77.8             83.4            82.5
≥3.5                       74.2             86.8            84.9
≥3.75                      71.7             87.7            85.3
≥4                         70.6             88.3            85.6
≥4.25                      65.5             92.6            88.5
≥4.5                       64.4             92.8            88.5
≥4.75                      63.9             93.2            88.8
≥5                         63.9             93.4            89.0
≥5.25                      61.9             95.7            90.6
≥5.5                       61.3             96.2            90.9
≥5.75                      60.8             96.6            91.2
≥6                         60.8             97.0            91.5
≥6.25                      55.2             98.2            91.7
≥6.5                       54.6             98.4            91.8
≥6.75                      54.6             98.5            91.9
≥7                         54.1             98.5            91.9
≥7.25*                     49.5             99.5            91.9
≥7.5                       46.9             99.5            91.5
≥7.75                      46.4             99.5            91.5
≥8                         45.9             99.5            91.4
≥8.25                      40.2             99.5            90.6
≥8.5                       38.7             99.7            90.5
≥8.75                      38.1             99.8            90.5
≥9                         37.6             99.8            90.5
≥9.25                      31.4            100.0            89.7
Note. CI, confidence interval. ^a Cutoff points <0 and ≥9.5 were excluded because, respectively, they yielded equal sensitivity (100%) but inferior specificity, or inferior sensitivity but equal specificity (100%). An asterisk (*) marks the cutoff point that maximized overall classification accuracy (≥7.25 points).
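As a minimal sketch of how each row of a table like Table 2 can be tabulated (pure NumPy; the risk_scores and esbl_status arrays below are synthetic stand-ins, not the study data):

```python
import numpy as np

def cutoff_metrics(risk_scores, esbl_status, cutoffs):
    """Sensitivity, specificity, and overall accuracy (%) at each cutoff."""
    scores = np.asarray(risk_scores, dtype=float)
    truth = np.asarray(esbl_status, dtype=int)
    rows = []
    for c in cutoffs:
        pred = scores >= c  # classify as predicted ESBL-positive at >= cutoff
        tp = np.sum(pred & (truth == 1))
        tn = np.sum(~pred & (truth == 0))
        rows.append({
            "cutoff": c,
            "sensitivity": 100 * tp / np.sum(truth == 1),
            "specificity": 100 * tn / np.sum(truth == 0),
            "accuracy": 100 * (tp + tn) / truth.size,
        })
    return rows

# Synthetic stand-in data for illustration only.
rng = np.random.default_rng(0)
risk_scores = rng.integers(0, 38, 500) / 4   # stand-in scores on a 0-9.25 point scale
esbl_status = rng.integers(0, 2, 500)        # stand-in 0/1 ESBL labels

# Sweep cutoffs in 0.25-point increments, as in Table 2, and pick the
# cutoff that maximizes overall classification accuracy.
table = cutoff_metrics(risk_scores, esbl_status, np.arange(0, 9.5, 0.25))
best = max(table, key=lambda r: r["accuracy"])
```

With the study data, this sweep is how a row such as ≥7.25 (sensitivity 49.5%, specificity 99.5%, accuracy 91.9%) is identified as the accuracy-maximizing cutoff.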
community- and hospital-onset ESBL or third-generation cephalosporin-resistant bacteremia in other populations.15,16
Table 3. Comparative Performance Metrics of a Logistic Regression-Derived Clinical Risk Score and a Machine Learning-Derived Decision Tree to Predict Extended-Spectrum β-Lactamase (ESBL) Status

Variable                                 Risk Score   Decision Tree
No. of included variables                14           5
Sensitivity, %^a                         49.5         51.0
Specificity, %^a                         99.5         99.1
Positive predictive value (PPV), %^a     94.6         90.8
Negative predictive value (NPV), %^a     91.8         91.9
Naïve C statistic                        0.87         0.77
Cross-validated C statistic              0.89         0.77

^a Risk score values vary depending upon the selected cutoff point for dichotomization. Values reflected for the risk score are for the cutoff point of ≥7.25 points, which optimized overall classification accuracy.
Taken together with the risk score’s similar C statistic following cross-validation (0.89), this evidence suggests that, despite the inclusion of a large number of variables, the risk score was not overfit.
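As an illustrative sketch of how naïve and cross-validated C statistics can be compared for the two model types (scikit-learn, with synthetic stand-in data; the fold count, tree depth, and class balance below are assumptions, not the authors’ settings):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a cohort with 14 candidate predictors and a
# roughly 15% ESBL-positive rate; a real analysis would use the study data.
X, y = make_classification(n_samples=500, n_features=14,
                           weights=[0.85], random_state=0)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
models = {
    "risk score (logistic regression)": LogisticRegression(max_iter=1000),
    "decision tree (CART)": DecisionTreeClassifier(max_depth=4, random_state=0),
}
for name, model in models.items():
    # Naive C statistic: score the same patients the model was fit on (optimistic).
    naive_c = roc_auc_score(y, model.fit(X, y).predict_proba(X)[:, 1])
    # Cross-validated C statistic: every patient is scored by a model
    # that never saw them during fitting.
    cv_prob = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    print(f"{name}: naive C = {naive_c:.2f}, "
          f"cross-validated C = {roc_auc_score(y, cv_prob):.2f}")
```

A cross-validated C statistic that tracks the naïve one, as reported for the risk score here (0.89 vs 0.87), is the pattern expected of a model that is not overfit.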
Given that risk scores for binary predictions are dichotomized at a cutoff point, in practice the risk score and the decision tree performed similarly: sensitivities of 49.5% and 51.0% and specificities of 99.5% and 99.1%, respectively. However, the risk score had an area under the curve (AUC) 0.10 higher than the decision tree’s (C statistics: 0.87 vs 0.77). This higher AUC offers users more latitude to prioritize sensitivity over specificity, or vice versa, by changing the cutoff point (as discussed in more detail below). In theory, a decision tree could also be developed to optimize a different balance of sensitivity and specificity, but this would require deriving an entirely new tree. The risk score’s greater flexibility, however, came at the cost of low user-friendliness for manual application. Studies consistently demonstrate that incorporating decision support tools at the point of care is important to their success,17 but manual tabulation of 14 variables would face significant bedside utilization barriers. In contrast, decision-tree branching logic does not require end-user calculations and, at least in this ESBL case study, the final decision tree included far fewer (ie, 5) predictors.
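To make the flexibility contrast concrete, a sketch continuing with synthetic stand-in data (the 90% sensitivity target and class weights are arbitrary illustrations): a fitted risk score exposes every operating point on its ROC curve simply by moving the cutoff, whereas re-balancing a CART model means growing a new tree, for example via scikit-learn’s class_weight parameter:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; a real analysis would use the cohort's predictors.
X, y = make_classification(n_samples=500, n_features=14,
                           weights=[0.85], random_state=0)
risk_scores = LogisticRegression(max_iter=1000).fit(X, y).decision_function(X)

# Risk score: one fitted model yields the entire ROC curve, so any
# sensitivity/specificity balance is reachable by changing the cutoff.
fpr, tpr, thresholds = roc_curve(y, risk_scores)
idx = int(np.argmax(tpr >= 0.90))   # first operating point with >=90% sensitivity
print(f"cutoff {thresholds[idx]:.2f}: sensitivity {100 * tpr[idx]:.1f}%, "
      f"specificity {100 * (1 - fpr[idx]):.1f}%")

# Decision tree: a single built-in operating point; prioritizing sensitivity
# instead requires deriving an entirely new tree, e.g., by penalizing
# missed ESBL-positive cases more heavily during fitting.
sensitive_tree = DecisionTreeClassifier(max_depth=4, class_weight={0: 1, 1: 5},
                                        random_state=0).fit(X, y)
```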
The potential tradeoff between flexibility and user-friendliness is an important consideration when evaluating whether risk scores or decision trees are the more suitable decision support tool for a given application. Additional considerations, however, may also help to guide researchers in selecting one option versus the other. Below, we summarize the relative strengths of risk scores and decision trees for model development and fitting, implementation, and adaptability. Of note, the CART analysis is the tree-fitting process (approach), and a decision tree is the result (output), just as logistic regression is a common (but by no means the only or necessarily even the preferred) approach for developing a risk score. Approach and output can differ in their strengths and limitations, and we distinguish these concepts in our discussion.
Methodological differences between logistic regression and CART influence the data assumptions and exploratory analyses required for model development and fitting. In general, the more complex or challenging the underlying data, the more utility a machine learning approach can provide. Specifically, logistic regression imposes important data requirements, including minimal collinearity (ie, correlation) among independent variables and a sufficient ratio of cases to predictors (ie, sufficient sample size; a general, although debatable, guideline is 10 expected cases per predictor variable).
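Both requirements can be screened for before fitting. A sketch using pandas and statsmodels, with a hypothetical DataFrame df whose columns are the candidate predictors plus a 0/1 outcome column named esbl (the VIF threshold noted in the comments is a common convention, not a rule from this study):

```python
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def prefit_screens(df: pd.DataFrame, outcome: str = "esbl") -> None:
    X = df.drop(columns=outcome)
    y = df[outcome]

    # Collinearity screen: variance inflation factors. VIFs above ~5-10
    # flag predictors too correlated with the others to yield stable
    # logistic-regression coefficients.
    exog = X.assign(const=1.0).to_numpy(dtype=float)  # append intercept column
    vif = pd.Series(
        [variance_inflation_factor(exog, i) for i in range(X.shape[1])],
        index=X.columns,
    )
    print(vif.sort_values(ascending=False))

    # Sample-size screen: the (debatable) guideline of ~10 expected cases
    # per candidate predictor cited above.
    print(f"Events per candidate predictor: {y.sum() / X.shape[1]:.1f}")
```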