A Comparative Analysis of the Decision Tree and Random Forest Algorithms in Diabetes Classification
Keywords:
Diabetes, Machine Learning, Decision Tree, Random Forest, Medical Prediction
Abstract
Diabetes is a global health issue that requires accurate early detection to prevent chronic complications. This study analyzes and compares the performance of two machine learning algorithms, Decision Tree and Random Forest, in predicting the risk of diabetes. The study uses the Pima Indians Diabetes dataset, a secondary dataset obtained from Kaggle, which was preprocessed by handling missing values and standardizing features with StandardScaler. The models were evaluated using accuracy, precision, recall, and F1-score. The results show that the Decision Tree algorithm performed better of the two, with an accuracy of 76%. The findings also indicate that glucose and body mass index (BMI) are the features with the greatest influence on prediction accuracy. This research is intended to serve as a reference for developing efficient, data-driven clinical decision support systems for early diabetes screening.
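
For readers unfamiliar with the preprocessing and evaluation steps named above, the following is a minimal sketch in plain Python, not the study's actual code. It shows z-score standardization (the transformation StandardScaler applies, using the population standard deviation) and the four reported metrics computed from predicted and true labels. The function names `standardize` and `classification_metrics` and the toy label lists are hypothetical and chosen only for illustration; they are not taken from the dataset or the study's results.

```python
from math import sqrt


def standardize(values):
    """Z-score a feature column: subtract the mean, divide by the population std."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (hypothetical): 1 = diabetic, 0 = non-diabetic.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)

# Toy feature column (hypothetical values), standardized to mean 0 and std 1.
toy_scaled = standardize([1.0, 2.0, 3.0])
```

Reporting precision, recall, and F1 alongside accuracy matters in a screening setting, because accuracy alone can look acceptable even when a model misses many positive cases.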


