Measuring the utility of increased care and testing, inputs that aren’t always immediately available
Consumers are familiar with how algorithms work. Amazon, using your prior buying habits, suggests similar purchases; YouTube, based on what you’ve already watched, offers related videos. The consumer can take it or leave it, and the company has spent little to make the offer. If the algorithms don’t work for you, not such a big deal.
But what about in a far more critical setting? One with countless variables, such as a hospital? Algorithms certainly hold out the promise to improve medical decision-making and reduce costs, but implementing just one — and there is the potential for thousands in a major medical center — is a complex task and requires careful monitoring of its performance and patient outcomes.
A favorite target of performance improvement methods — manual or digital — at hospitals is readmissions. When patients have to return to the hospital after a surgery or treatment, it suggests they were discharged in error. Government reimbursement programs often carry penalties for readmissions, and, obviously, there is considerable risk to patients.
Can Algorithms Really Identify Patients at Risk of Readmission?
In a paper published in Nature npj Digital Medicine, UCLA Medicine’s Eilon Gabel and UCLA Anderson’s Velibor V. Mišić and Kumar Rajaram explore how best to evaluate possible algorithms, or machine learning models, that would seek to identify patients at risk of readmission.
Understanding the value of a machine learning model from a clinical perspective is not straightforward. In machine learning, a commonly used metric is the area under the curve (AUC), which answers the following question. Suppose we are given a patient who will be readmitted and a patient who will not be readmitted, but we do not know which is which. What is the chance that the model assigns a higher risk to the patient who goes on to be readmitted than to the patient who does not?
If we opted to flip a coin and make a random guess as to which one will be readmitted, we would be right only 50% of the time. If we had a perfect model, we would be right 100% of the time. So for any model that does at least as well as guessing, this metric ranges from 50% to 100%. In practice, an AUC of 70% is considered fair; an AUC of 80% is considered good; and an AUC of 90% is considered excellent.
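The pairwise reading of AUC described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `pairwise_auc` and the risk scores are hypothetical, and ties are counted as half a correct ranking, which is a common convention rather than something the article specifies.

```python
from itertools import product


def pairwise_auc(scores_readmitted, scores_not_readmitted):
    """Share of (readmitted, not readmitted) pairs in which the readmitted
    patient gets the higher risk score. Ties count as half."""
    pairs = list(product(scores_readmitted, scores_not_readmitted))
    correct = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos, neg in pairs
    )
    return correct / len(pairs)


# Hypothetical model risk scores, invented for illustration only.
readmitted = [0.9, 0.6, 0.4]
not_readmitted = [0.5, 0.3, 0.2, 0.1]

auc = pairwise_auc(readmitted, not_readmitted)
print(f"AUC = {auc:.3f}")
```

In this made-up example the model ranks 11 of the 12 possible pairs correctly, so the AUC is about 92%. A model that gave every patient the same score would tie on every pair and land at 50%, the coin-flip level.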
But AUC is abstracted from the realities of clinical decision making. It does not take into account that a provider is using the model to identify patients and is operating under a limited schedule. It does not take into account the cost savings of each readmission that the model correctly anticipates or the costs of the provider needed to operationalize that model.
The authors use data on 19,331 surgical admissions to the Ronald Reagan UCLA Medical Center during an 847-day period ending in 2018, including the 969 patients (5% of total) who were readmitted to the hospital’s emergency department within 30 days of being discharged. Their goal is to see whether, and how well, machine learning models could have identified that subset of patients so that those readmissions might have been prevented.
The authors test four machine learning models and examine their performance at various levels of availability of care providers (physician or nurse). In addition to improving patient outcomes, an algorithm needs to fit into the workflow of care providers and also be cost effective, the authors note.
Balancing Resources with Efforts to Reduce Readmissions
To be sure, unlimited patient access to a physician or nurse’s time would reduce readmissions. The models’ job is to stop potential readmissions without wasting those valuable resources on patients who’ll do fine after discharge.
Surgeons and specialists aren’t always available every day, so the authors test the four models under three provider schedules:
- Provider sees eight patients every Monday
- Provider sees eight patients on Monday and eight on Wednesday
- Provider sees eight patients per day, Monday to Friday
Two of the models are applied only on the date of a patient’s discharge, so if a patient is discharged on a Tuesday, those models won’t select that patient to see a provider under the first two schedules. Thus, they yield lower figures on the “patients seen” metric.
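The interaction between provider schedule and model timing can be sketched as a toy simulation. Everything here is an assumption made for illustration: the `Patient` fields, the `select_patients` function, the three patients and their risk scores are hypothetical, and the non-discharge-day model is assumed to consider any patient currently in the hospital. It is not the authors' simulation.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    name: str
    admit_day: int      # day index; 0 is a Monday
    discharge_day: int
    risk: float         # hypothetical predicted readmission risk


def select_patients(patients, provider_weekdays, capacity, horizon_days,
                    discharge_day_only):
    """Return names of patients the provider sees, in order of selection."""
    seen = []
    for day in range(horizon_days):
        if day % 7 not in provider_weekdays:
            continue  # provider is not available on this weekday
        if discharge_day_only:
            candidates = [p for p in patients if p.discharge_day == day]
        else:
            candidates = [p for p in patients
                          if p.admit_day <= day <= p.discharge_day]
        candidates = [p for p in candidates if p.name not in seen]
        candidates.sort(key=lambda p: p.risk, reverse=True)
        seen.extend(p.name for p in candidates[:capacity])
    return seen


patients = [
    Patient("A", admit_day=0, discharge_day=1, risk=0.8),  # discharged Tuesday
    Patient("B", admit_day=0, discharge_day=0, risk=0.3),  # discharged Monday
    Patient("C", admit_day=1, discharge_day=2, risk=0.6),  # discharged Wednesday
]

MON_WED = {0, 2}  # weekday indices for the Monday and Wednesday schedule
rigid = select_patients(patients, MON_WED, 8, 7, discharge_day_only=True)
flexible = select_patients(patients, MON_WED, 8, 7, discharge_day_only=False)
print(rigid, flexible)
```

In this toy case the discharge-day-only rule never selects patient A, the highest-risk patient, because A leaves on a Tuesday when no provider is scheduled. The rule that looks at everyone in the hospital picks A up on Monday.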
When provider availability is limited, the use of a more sophisticated machine learning model that incorporates lab test results improves readmission results significantly. As provider availability increases, the difference between a model that uses lab results and one that eschews labs narrows.
Cost savings roughly track the number of readmissions prevented, though provider time is expensive and reduces the net savings.
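The trade-off between readmissions avoided and provider cost comes down to simple arithmetic, sketched below. All dollar amounts and counts are hypothetical placeholders, not figures from the paper, and the function `net_savings` is an illustrative name.

```python
def net_savings(readmissions_prevented, cost_per_readmission,
                provider_days, provider_cost_per_day):
    """Gross savings from avoided readmissions minus the cost of provider time."""
    gross = readmissions_prevented * cost_per_readmission
    provider_cost = provider_days * provider_cost_per_day
    return gross - provider_cost


# Hypothetical yearly scenarios, invented for illustration only.
monday_only = net_savings(10, 15_000, 52, 2_000)   # provider works 52 days
every_weekday = net_savings(40, 15_000, 260, 2_000)  # provider works 260 days
print(monday_only, every_weekday)
```

With these invented numbers, the weekday schedule prevents four times as many readmissions, but net savings rise from 46,000 to only 80,000 because the provider works five times as many days.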
The authors’ results show that models that rigidly call for examination, say, on day of discharge, are less valuable for predicting patient readmission and for cost savings.
This could be because care providers aren’t available on that day or, conversely, because they see patients who are discharged on that day even if those patients are not the ones most in need of their care.
The authors’ simulation model is helpful for hospital administrators and staff as they determine which machine learning models will be useful for their particular clinical setting and level of staffing and resources.
About the Research
Gabel, E., Mišić, V., Rajaram, K. (2021). A Simulation-Based Evaluation of Machine Learning Models for Clinical Decision Support: Application and Analysis Using Hospital Readmission. Nature npj Digital Medicine, 4, 98. https://doi.org/10.1038/s41746-021-00468-7