Evaluating Applicability of Different COVID-19 Predictive Risk Models on Estonian Health Data

Name
Marc David
Abstract
Since the novel COVID-19 disease began to spread in 2019, the workload on health care systems worldwide has increased. Risk prediction models could help optimize the use of health care resources, because they predict how severe the course of the disease is likely to be for a given patient. A sufficiently accurate model could serve several purposes, for instance identifying the patients who most need a vaccine or hospital care. A major factor in model performance is the size of the dataset used to train it. Because Estonia has a relatively small amount of health data, developing a new model is difficult. Externally validating a model that has already been trained on larger datasets is simpler and more cost-effective.
The goal of this thesis was to externally validate existing risk models and to analyse their potential for practical use. The validation results show that the models’ discrimination on the Estonian health data, as measured by their AUC and AUPRC values, is fairly good. Calibration, however, was poor: the models predicted a much lower risk for patients than the observed probability. This could be explained by a bias in the Estonian health data used for validation. A barrier to practical use is that both the models and the validation data are outdated, since both date from the beginning of the pandemic. As a result, the models do not account for newer virus mutations or a patient’s vaccination history. A solution to these problems would be to train a new model on newer and better data.
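The distinction between discrimination and calibration drawn above can be illustrated with a minimal sketch. It is not the thesis's validation code: the function names, the choice of average precision as the AUPRC estimate, the observed-to-expected ratio as the calibration summary, and the toy data are all assumptions made for illustration.

```python
def auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Average precision, one common AUPRC estimate (assumes no tied scores)."""
    ranked = sorted(zip(y_score, y_true), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits


def observed_to_expected(y_true, y_score):
    """Observed event rate divided by mean predicted risk; > 1 means risk is underestimated."""
    observed = sum(y_true) / len(y_true)
    expected = sum(y_score) / len(y_score)
    return observed / expected


# Hypothetical toy data: outcomes (1 = severe course) and model-predicted risks.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.02, 0.05, 0.10, 0.08, 0.20, 0.30, 0.12, 0.15]

print("AUC:  ", auc(y_true, y_score))
print("AUPRC:", average_precision(y_true, y_score))
print("O/E:  ", observed_to_expected(y_true, y_score))
```

On this toy data the ranking metrics are high while the observed-to-expected ratio is well above 1, which mirrors the pattern described in the abstract: a model can order patients well and still systematically underestimate their risk.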
Graduation Thesis language
Estonian
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Raivo Kolde
Defence year
2021
 