Volume 17, Issue 3 (7-2025)                   IJDO 2025, 17(3): 182-192


Sefid F, Norouzi-Ghahjavarestani N, Soleymani-Tabasi M, Zarepour-Ahmadabadi J, Azamirad G, Vahidi Mehrjardi M Y, et al. Predicting Diabetes Risk Using Machine Learning: A Comparative Study on the Yazd Health Study (YaHS). IJDO 2025; 17(3): 182-192
URL: http://ijdo.ssu.ac.ir/article-1-967-en.html
Abortion Research Centre, Yazd Reproductive Sciences Institute, Shahid Sadoughi University of Medical Sciences, Yazd, Iran. Meybod Genetic Research Center, Yazd, Iran.
Abstract:
Diabetes is a chronic disease with a substantial global health burden, which makes accurate early risk prediction important for prevention and management. This study evaluates five machine learning algorithms, Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), and Decision Tree (DT), for diabetes risk prediction using a dataset from the Yazd Health Study (YaHS). Preprocessing included data cleaning, class imbalance handling with the Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbors (SMOTEENN), and feature selection, all applied to improve model performance. Among the evaluated algorithms, the Random Forest classifier performed best, with an accuracy of 97%. The findings highlight the importance of effective data preprocessing and algorithm selection in developing reliable predictive models from healthcare datasets.
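The SMOTEENN resampling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which library, neighbor count, or settings were used, so the helper names (`nearest`, `smote`, `enn`), the choice of k = 3, the binary-label assumption, and the synthetic two-feature data below are all assumptions made for illustration. The sketch first oversamples the minority class by interpolating between minority neighbors (SMOTE), then removes any sample whose label disagrees with the majority of its nearest neighbors (ENN).

```python
import math
import random
from collections import Counter

def nearest(idx, X, k, candidates):
    """Return the k indices in `candidates` closest to X[idx], excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(X[idx], X[j]))
    return others[:k]

def smote(X, y, minority, k=3, rng=None):
    """SMOTE for binary labels: add synthetic minority points until classes balance."""
    rng = rng or random.Random(0)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)  # majority count minus minority count
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)                      # a random minority sample
        j = rng.choice(nearest(i, X, k, min_idx))    # one of its minority neighbours
        gap = rng.random()                           # position along the segment i -> j
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_out.append(minority)
    return X_out, y_out

def enn(X, y, k=3):
    """Edited Nearest Neighbours: keep a point only if most of its k neighbours share its label."""
    all_idx = range(len(X))
    keep = []
    for i in all_idx:
        votes = Counter(y[j] for j in nearest(i, X, k, all_idx))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

# Synthetic, imbalanced toy data (hypothetical; not YaHS data): 90 negatives, 10 positives.
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)] + \
    [(rng.gauss(2.5, 1), rng.gauss(2.5, 1)) for _ in range(10)]
y = [0] * 90 + [1] * 10

X_sm, y_sm = smote(X, y, minority=1, rng=rng)   # oversampling step
X_res, y_res = enn(X_sm, y_sm)                  # cleaning step
print("original:", Counter(y))
print("after SMOTE:", Counter(y_sm))
print("after SMOTE + ENN:", Counter(y_res))
```

In practice, resampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to estimate accuracy; whether the study did so is not stated in the abstract.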
 
Type of Study: Research | Subject: Special
Received: 2025/05/29 | Accepted: 2025/06/30 | Published: 2025/07/28



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2025 CC BY-NC 4.0 | Iranian Journal of Diabetes and Obesity
