Date of Award
7-1-2024
Thesis Type
PhD
Document Type
Thesis (Restricted Access)
Divisions
eng
Department
Department of Biomedical Engineering
Institution
Universiti Malaya
Abstract
Coronavirus disease (COVID-19) has evolved rapidly and caused a rise in hospital readmissions. To help reduce the rate of hospital readmission, a retrospective study was carried out on 1578 COVID-19 patients admitted to Universiti Malaya Medical Centre (UMMC) from May 2020 to January 2022. This study aimed to use machine learning and deep learning to predict readmission risk, with three main objectives: to identify potential clinical risk factors leading to COVID-19 readmission, to build a predictive model to prognosticate unplanned hospital readmission, and to analyse the characteristics, duration of treatment and recovery rate of readmitted COVID-19 patients in Malaysia. The study consists of three phases, commencing with the preliminary phase, in which medical ethics approval was obtained for data collection at UMMC. Following data acquisition, cleaning, and preprocessing, unstructured data underwent Bag of Words analysis through Natural Language Processing (NLP), while statistical analyses and correlation tests were executed on the refined patient data. Feature selection using the Recursive Feature Elimination (RFE) technique preceded the construction and training of three machine learning models: Logistic Regression, Decision Tree Classifier and Support Vector Machine. Logistic Regression performed best (0.919 accuracy, 0.636 area under the curve (AUC)). In the progressing phase, the dataset was expanded from 443 to 1578 patient records, with a COVID-19 readmission rate of 8.68%. The dataset expansion prompted the re-computation of the statistical analyses, feature selection, and machine learning processes. A total of six machine learning models were developed and trained, namely Logistic Regression, Decision Tree Classifier, Support Vector Machine, Random Forest, eXtreme Gradient Boosting and Category Boosting.
Concurrently, six deep learning models were developed and trained after data balancing, namely Multilayer Perceptron, TabNet, Value Imputation and Mask Estimation, TabTransformer, Deep Factorial Machine, and Regularization Learning Model. Machine learning performed better than deep learning overall, with Logistic Regression standing out among the models (0.946 accuracy, 0.639 AUC). In the analysis of readmitted patients, most had a length of stay (LOS) of 7 days or less (76.64%), and the majority returned to hospital within a 90-day interval (70.8%), indicating a good recovery rate for COVID-19 in the observed population. In the finalizing phase, various feature selection techniques were employed to discern the risk factors for COVID-19 readmission. Seven clinical risk factors for COVID-19 readmission were finalized, namely heart rate, cough, age, LOS, diabetes mellitus, hyperparathyroidism, and asthma. Ultimately, a novel hybrid predictive model integrating the Slime Mold Algorithm (SMA) was developed. By integrating SMA into a Support Vector Machine (SVM), the predictive model achieved an accuracy of 0.946 and an AUC of 0.734.
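The abstract reports accuracy and AUC side by side, and with a readmission rate of 8.68% the two can diverge sharply. The following is a minimal sketch, not taken from the thesis, of a hypothetical baseline that labels every patient as "not readmitted". It assumes the 8.68% rate applies to all 1578 patients, which implies roughly 137 readmissions (an inferred count, not a figure stated in the abstract). The helper names `accuracy` and `roc_auc` are illustrative, and the thesis's own evaluation split is not specified here.

```python
# Illustrative only: the counts below are inferred from the abstract
# (1578 patients, 8.68% readmission rate), not reported by the thesis.
N_PATIENTS = 1578
READMISSION_RATE = 0.0868
n_readmitted = round(N_PATIENTS * READMISSION_RATE)  # inferred count: 137


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case, with ties counted as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# 1 = readmitted, 0 = not readmitted.
y_true = [1] * n_readmitted + [0] * (N_PATIENTS - n_readmitted)

# Hypothetical baseline: predict "not readmitted" for every patient,
# with the same constant risk score for everyone.
y_pred = [0] * N_PATIENTS
scores = [0.0] * N_PATIENTS

baseline_accuracy = accuracy(y_true, y_pred)
baseline_auc = roc_auc(y_true, scores)
print(f"baseline accuracy = {baseline_accuracy:.3f}, AUC = {baseline_auc:.3f}")
# prints: baseline accuracy = 0.913, AUC = 0.500
```

Under these assumptions, a model that never flags a readmission would still score about 0.913 on accuracy while its AUC stays at chance level (0.5). This suggests why the AUC values in the abstract (0.636 to 0.734) are the more informative measure of how well each model separates readmitted from non-readmitted patients.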
Note
Thesis (PhD) - Faculty of Engineering, Universiti Malaya, 2024.
Recommended Citation
Loo, Wei Kit, "Hospital readmission risk prediction of COVID-19 patients using machine learning / Loo Wei Kit" (2024). Student Works (2020-2029). 1757.
https://knova.um.edu.my/student_works_2020s/1757