Enhancing Stroke Diagnosis with Machine Learning and SHAP-Based Explainable AI Models

Authors

DOI:

https://doi.org/10.30983/knowbase.v4i2.8720

Keywords:

Algorithm insight, Explainable AI, Machine Learning, SHAP Algorithm, Stroke detection

Abstract

Stroke is a serious condition that requires rapid treatment to improve patient outcomes. Machine Learning (ML) shows promise for automated stroke detection through precise neuroimaging analysis. Although existing research has explored ML applications in stroke medicine, challenges remain, including validation concerns and the limitations of available datasets. This study compares ML models and uses the SHapley Additive exPlanations (SHAP) algorithm to gain insight into them, with the goal of optimizing stroke detection. It evaluates the performance of several classifiers, including Deep Neural Networks (DNN), AdaBoost, Support Vector Machines (SVM), and XGBoost, on data obtained from www.kaggle.com. The results show that XGBoost performs best across the data splits tested, which underscores its effectiveness for stroke prediction. SHAP provides deeper insight into stroke risk factors and supports a more comprehensive risk assessment. Overall, the study contributes to advancing stroke detection methodologies and highlights the role of ML in enhancing clinical practice in stroke medicine. Further research could explore additional datasets and more advanced ML algorithms to improve prediction accuracy and support preventive measures.
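To make the idea behind SHAP concrete, the following is a minimal sketch of exact Shapley value attribution, the quantity that SHAP approximates. It is not the study's code: the function `toy_risk`, its coefficients, the feature names, and the patient and baseline values are all hypothetical and chosen only for illustration. Practical SHAP implementations rely on faster, model-specific algorithms instead of the brute-force enumeration used here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        # Build an input that uses x for coalition members, baseline otherwise.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score (illustrative only, not a clinical model)."""
    age, hypertension, glucose = z
    return (0.01 * age + 0.2 * hypertension + 0.001 * glucose
            + 0.002 * age * hypertension)


feature_names = ["age", "hypertension", "avg_glucose"]  # hypothetical names
x = [70, 1, 180]         # hypothetical patient
baseline = [45, 0, 100]  # hypothetical reference patient

phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(feature_names, phi):
    print(f"{name}: {contribution:+.3f}")

# Additivity: the contributions sum to the gap between the two predictions.
gap = toy_risk(x) - toy_risk(baseline)
print(f"sum of contributions: {sum(phi):.3f}, prediction gap: {gap:.3f}")
```

The per-feature contributions always sum to the difference between the prediction for the patient and the prediction for the baseline, which is the additivity property that makes this kind of attribution usable for risk-factor analysis.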

Submitted

2024-11-07

Accepted

2025-01-09

Published

2024-12-31