If coronavirus disease (COVID-19) is not predicted, managed, and controlled in a timely manner, the health system of any country and its people will face serious problems. Predictive models can support health resource management and help prevent outbreaks and deaths caused by COVID-19. The present study aimed to predict mortality in patients with COVID-19 using data mining techniques. First, the mortality factors of COVID-19 patients were identified from previous studies and confirmed by specialist physicians. Based on the confirmed factors, the data of COVID-19 patients were extracted from 850 medical records. Decision tree (J48), multilayer perceptron (MLP), k-nearest neighbors (KNN), random forest, and support vector machine (SVM) models were used for prediction. The models were evaluated based on accuracy, precision, specificity, sensitivity, and the ROC curve. According to the results, the most effective factor for predicting the death of COVID-19 patients was dyspnea. Based on ROC (1.000), accuracy (99.23%), precision (99.74%), sensitivity (98.25%), and specificity (99.84%), random forest was the best model for predicting mortality. The KNN5, MLP, and J48 models ranked next, in that order. Data analysis of COVID-19 patients can be a suitable and practical tool for predicting the mortality of these patients. Given the sensitivity of medical science to preserving human life and the lack of specialized human resources in the health system, using the proposed models can increase the chances of successful treatment, prevent early death, and reduce the costs associated with long treatments for patients, hospitals, and the insurance industry.
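The four threshold-based measures named above (accuracy, precision, sensitivity, and specificity) all derive from the counts of a binary confusion matrix. The following is a minimal Python sketch of those standard definitions, assuming death is coded as the positive class (1). The names `ConfusionCounts`, `count_outcomes`, and `evaluate`, as well as the toy labels, are illustrative assumptions and are not taken from the study or its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # died, predicted died
    fp: int  # survived, predicted died
    tn: int  # survived, predicted survived
    fn: int  # died, predicted survived


def count_outcomes(actual, predicted, positive=1):
    """Tally confusion-matrix counts for paired label sequences."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(c):
    """Compute the four measures from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # also called recall
        "specificity": _ratio(c.tn, c.tn + c.fp),
    }


# Hypothetical toy labels (1 = died, 0 = survived); not the study's data.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]

counts = count_outcomes(actual, predicted)
for name, value in evaluate(counts).items():
    print(f"{name}: {value:.4f}")
```

In a mortality setting, sensitivity reflects how many deaths the model flags, while specificity reflects how many survivors it correctly clears, so the two are usually reported together rather than relying on accuracy alone.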