December 19, 2022
Background:
Risk stratification helps guide appropriate clinical care. Our goal was to develop and validate a broad suite of predictive tools based on International Classification of Diseases, Tenth Revision, diagnostic and procedural codes for predicting adverse events and care utilization outcomes for hospitalized patients.
Methods:
Endpoints included unplanned hospital admissions, discharge status, excess length of stay, in-hospital and 90-day mortality, acute kidney injury, sepsis, pneumonia, respiratory failure, and a composite of major cardiac complications. Patient demographic and coding history in the year before admission provided features used to predict utilization and adverse events through 90 days after admission. Models were trained and refined on 2017 to 2018 Medicare admissions data using an 80:20 learn-to-test split sample. Models were then prospectively tested on 2019 out-of-sample Medicare admissions. Predictions based on logistic regression were compared with those from five commonly used machine learning methods using a limited dataset.
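The partitioning described above can be illustrated with a minimal sketch. It assumes each admission is a record carrying a `year` field and that the 80:20 split is a simple random split at the admission level; the abstract does not say whether the split was made by patient or by admission, and the function name, field names, and seed are hypothetical.

```python
import random


def split_development_and_validation(admissions, learn_fraction=0.8, seed=0):
    """Partition admissions into learn, test, and out-of-sample validation sets.

    Assumption: 2017-2018 admissions form the development set, which is
    shuffled and cut 80:20; 2019 admissions are held out for validation.
    """
    development = [a for a in admissions if a["year"] in (2017, 2018)]
    validation = [a for a in admissions if a["year"] == 2019]

    # Shuffle a copy with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    shuffled = development[:]
    rng.shuffle(shuffled)

    cut = int(round(learn_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:], validation


# Toy data (hypothetical): 30 admissions spread evenly over three years.
admissions = [{"id": i, "year": 2017 + i % 3} for i in range(30)]
learn, test, validation = split_development_and_validation(admissions)
```

Holding out a full later year, rather than only a random subset, tests the models on admissions that postdate everything seen during training.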
Results:
The 2017 to 2018 development set included 9,085,968 patients who had 18,899,224 inpatient admissions, and there were 5,336,265 patients who had 9,205,835 inpatient admissions in the 2019 validation dataset. Model performance on the validation set had an average area under the curve of 0.76 (range, 0.70 to 0.82). Model calibration was strong with an average R² for the 99% of patients at lowest risk of 1.00. Excess length of stay had a root-mean-square error of 0.19 and R² of 0.99. The mean sensitivity for the highest 5% risk population was 19.2% (range, 11.6 to 30.1); for positive predictive value, it was 37.2% (14.6 to 87.7); and for lift (enrichment ratio), it was 3.8 (2.3 to 6.1). Predictive accuracies from regression and machine learning techniques were generally similar.
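The three metrics reported for the highest 5% risk population are linked: when a fixed fraction of patients is flagged, sensitivity equals lift multiplied by that fraction, which matches the reported means (3.8 × 0.05 ≈ 0.19). The sketch below shows one way such metrics could be computed from predicted risks and observed binary outcomes. It assumes lift is defined as positive predictive value divided by overall event prevalence; the function name and toy data are hypothetical and are not taken from the study.

```python
def top_fraction_metrics(scores, outcomes, fraction=0.05):
    """Return (sensitivity, ppv, lift) for the highest-risk fraction of patients.

    scores: predicted risk per patient; outcomes: 1 if the event occurred, else 0.
    """
    n = len(scores)
    total_events = sum(outcomes)
    if n == 0 or total_events == 0:
        raise ValueError("need at least one patient and one observed event")

    # Flag the patients with the highest predicted risk.
    n_flagged = max(1, int(round(fraction * n)))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    true_positives = sum(outcomes[i] for i in order[:n_flagged])

    sensitivity = true_positives / total_events  # share of all events captured
    ppv = true_positives / n_flagged             # event rate among flagged patients
    lift = ppv / (total_events / n)              # enrichment over baseline prevalence
    return sensitivity, ppv, lift


# Toy data (hypothetical): 100 patients, 10 events, 3 of them in the top 5%.
scores = [i / 100 for i in range(100)]
event_indices = {99, 98, 97, 50, 40, 30, 20, 10, 5, 1}
outcomes = [1 if i in event_indices else 0 for i in range(100)]
sensitivity, ppv, lift = top_fraction_metrics(scores, outcomes)
```

In this toy case the flagged group captures 3 of 10 events, so sensitivity is 0.3, positive predictive value is 0.6, and lift is approximately 6.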
Conclusions:
Predictive analytical modeling based on administrative claims history can provide individualized risk profiles at hospital admission that may help guide patient management. Similar results from six different modeling approaches suggest that we have identified both the value and ceiling for predictive information derived from medical claims history.