TY - JOUR
T1 - Prediction of university desertion through hybridization of classification algorithms
AU - Rocha, Carol Francia
AU - Zelaya, Yuliana Flores
AU - Sánchez, David Mauricio
AU - Pérez, Armando Fermín
N1 - Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2017
Y1 - 2017
N2 - University desertion in Peru is a social phenomenon that entails the loss of Peruvian public investment in higher education (no less than a hundred million dollars per year) as well as the investment made by students' parents. The aim of this research is therefore to develop a model for predicting dropout among Peruvian university students that identifies those at greatest risk of leaving their studies, making it possible to take preventive measures that help contain the desertion rate and, in the long term, may reduce it. We identified the twenty-four most influential factors. The methodology used was KDD (Knowledge Discovery in Databases), and we worked with three classification algorithms, Naive Bayes, Multilayer Perceptron and C4.5 Decision Tree, both separately and combined into a hybrid prediction algorithm. Each algorithm was chosen for its frequent use in prior research and its high prediction precision. The case study was the School of Systems Engineering of the National University of San Marcos; we used data on 840 students from 2008 to 2013.
KW - Data mining
KW - Desertion factors
KW - Prediction
KW - University dropout
UR - http://www.scopus.com/inward/record.url?scp=85040550811&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85040550811
SN - 1613-0073
VL - 2029
SP - 215
EP - 222
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 4th Annual International Symposium on Information Management and Big Data, SIMBig 2017
Y2 - 4 September 2017 through 6 September 2017
ER -