Master's Student of Educational Psychology, Educational Psychology Department, Shabestar Branch, Islamic Azad University, Shabestar, Iran
Abstract
Student dropout is a fundamental challenge for education systems worldwide, imposing significant economic and social consequences on both society and individuals. With the development of modern technologies and the widespread use of educational information systems, a vast amount of educational data is generated, enabling advanced analytics. Big data and machine learning techniques have created unprecedented opportunities for the early identification of students at risk of dropping out. The main objective of this study is to develop a comprehensive model for predicting student dropout using big data analysis and machine learning algorithms. The research questions are: 1) Which factors have the most significant impact on the risk of dropping out? 2) Which machine learning algorithm performs best in predicting dropout? 3) How can an effective early warning system be designed? This research was conducted using a quantitative, descriptive-correlational design. The sample consisted of 450 students from various Iranian universities, selected through stratified random sampling. The data included demographic variables, academic performance, attendance rates, participation levels, and socio-economic factors. Six machine learning algorithms—Logistic Regression, Random Forest, Support Vector Machine, Naive Bayes, Neural Network, and Decision Tree—were used for data analysis. The models were validated using 10-fold cross-validation. The results showed that the Naive Bayes algorithm performed best with an accuracy of 92.4%, precision of 89.7%, and sensitivity of 94.2%. The most important predictors of dropout were current GPA (0.284), attendance rate (0.195), participation score (0.148), and weekly study hours (0.117). A significant negative correlation was observed between current GPA and dropout risk (-0.674). Regression analysis indicated that all main variables except gender had a significant effect on dropout risk. 
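The evaluation protocol described above (stratified sampling, 10-fold cross-validation, and accuracy, precision, and sensitivity as metrics) can be illustrated with a minimal standard-library sketch. It is not the study's code: the function names `stratified_k_folds` and `classification_metrics`, the random seed, and the assumed count of 90 dropouts among 450 students are hypothetical and chosen only to make the example concrete.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets a fair share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and sensitivity (recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels: 450 students, 90 of whom dropped out (1 = dropout).
labels = [1] * 90 + [0] * 360
folds = stratified_k_folds(labels, k=10)

# Each fold serves once as the held-out test set; the other nine train the model.
for fold in folds:
    test_idx = set(fold)
    train_idx = [i for i in range(len(labels)) if i not in test_idx]
    # A classifier would be fitted on train_idx and scored on fold here.
```

In such a setup, the per-fold metrics would be averaged to give one accuracy, precision, and sensitivity figure per algorithm, which is the kind of comparison the reported results summarize.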
This study demonstrates that using big data and machine learning algorithms can be a powerful tool for predicting dropout and identifying at-risk students. The results contribute to the development of early warning systems that allow for timely intervention and dropout prevention. The practical implications of this research include improving educational policymaking, optimizing resource allocation, and enhancing the quality of student support services.
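One simple way an early warning system could use a trained model's output is to flag students whose predicted dropout probability exceeds a cutoff and rank them for follow-up. The sketch below assumes this design. The `StudentRisk` type, the `flag_at_risk` function, the 0.5 threshold, and the sample probabilities are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class StudentRisk:
    student_id: str
    dropout_probability: float  # assumed output of any trained classifier


def flag_at_risk(students, threshold=0.5):
    """Return students whose predicted risk meets the threshold, highest risk first."""
    flagged = [s for s in students if s.dropout_probability >= threshold]
    return sorted(flagged, key=lambda s: s.dropout_probability, reverse=True)


# Hypothetical cohort with made-up probabilities, for illustration only.
cohort = [
    StudentRisk("s001", 0.12),
    StudentRisk("s002", 0.81),
    StudentRisk("s003", 0.57),
]
alerts = flag_at_risk(cohort, threshold=0.5)
print([s.student_id for s in alerts])  # ['s002', 's003']
```

In practice the threshold would be tuned against the cost of missed cases versus unnecessary interventions. A lower cutoff favors sensitivity at the expense of precision.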
Asgharan, L. (2025). Analysis of Big Data in Education to Predict Dropout and Identify At-Risk Students. Management Research and Development, e225778.