Predicting and Analyzing Student Absenteeism Using Machine Learning Algorithm
Introduction. In a developed society, the state should invest in the education of the younger generation. In less developed countries, Albania included, there are no nationwide studies of the factors that affect student absence from classrooms. The purpose of this study is to predict, analyze, and evaluate the possible causes of student absenteeism using machine learning algorithms. The attributes taken into account in this study relate to family, demographic, social, university, and personal aspects, according to academic criteria.
Materials and Methods. Student absenteeism covers any student who has not attended class, irrespective of the reason. The data set consists of 26 attributes and 210,000 records corresponding to the teaching hours of 500 students during an academic year at the Faculty of Information Technology. The students participating in the survey are of both genders and range from 18 to 25 years of age. The student questionnaire was compiled by reviewing the literature and analyzing 26 attributes, which we categorized into 5 groups included in the questionnaire.
Results. This paper contributes knowledge on the analysis and evaluation of factors that lead students to miss lectures, using machine learning. It is important to note that this study was conducted on students of this faculty, and as such, the results may not generalize to all universities. Researchers are therefore encouraged to test the results achieved in this paper on other groups of students.
Discussion and Conclusion. The paper provides recommendations based on the findings by offering different problem-solving strategies. The questionnaire, used here with only 500 Faculty of Information Technology students, can be widely applied in any educational institution in the region. However, the results of this study cannot be generalized to the student and youth population of other regions or other countries. This paper provides an original and easily usable questionnaire suitable for various study programs and universities.
Keywords: student absenteeism, family, demographic, social, university, personal aspects, data mining, machine learning
Conflict of interests: The authors declare no conflict of interest.
For citation: Mukli L., Rista A. Predicting and Analyzing Student Absenteeism Using Machine Learning Algorithm. Integration of Education. 2022;26(2):216‒228. doi: https://doi.org/10.15507/1991-9468.107.026.202202.216-228
All authors have read and approved the final manuscript.
Submitted 25.08.2021; approved after reviewing 10.01.2022; accepted for publication 17.01.2022.
Contribution of the authors:
L. Mukli – contributed to the conception of absenteeism measurement for the faculty and the overall perception of the survey results.
A. Rista – contributed to the statistical analysis and dataset construction.
This work is licensed under a Creative Commons Attribution 4.0 License.