Predicting students' academic achievement using methods of educational data mining


Alturki, Sarah


PDF: thesisSarah (submitted version).pdf - Published version (5MB)

URN: urn:nbn:de:bsz:180-madoc-636741
Document type: Doctoral dissertation
Year of publication: 2022
Place of publication: Mannheim
University: University of Mannheim
Reviewer: Stuckenschmidt, Heiner
Date of oral examination: 19 December 2022
Language of publication: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 004 Computer science
Keywords (English): student performance, machine learning, educational data mining, students' dropout, predictions, imbalanced dataset, oversampling methods
Abstract: The rapid growth of educational data creates a need to extract meaningful information from it. Educational Data Mining (EDM) has become an active research area that can reveal valuable knowledge from educational databases. This knowledge can serve many purposes, including identifying dropouts or weak students who need special attention and discovering exceptional students who can be offered lifetime opportunities. This thesis surveys the field of EDM, with a focus on academic prediction tasks. It provides a comprehensive background for understanding EDM and discusses the different methods and applications of data mining in education. It also provides a literature review on predicting students' academic achievement, covering related work from 2007 to 2022. Furthermore, it examines the application of machine learning algorithms to predict students' academic achievement on two diverse datasets.

The first dataset was obtained from the College of Computer and Information Science at Princess Norah University (PNU) in Riyadh, Saudi Arabia. In this study, the records of 300 undergraduate students were used to predict their final academic achievement. We used the Weka software to compare the performance of eight data mining algorithms: C4.5, Simple CART, LADTree, Support Vector Machine, Naïve Bayes, K-Nearest Neighbor, Artificial Neural Networks, and Random Forest. The models were validated using 10-fold cross-validation. The empirical results show that: (i) in the College of Computer and Information Science, the most important features for predicting student academic achievement are the student's GPA in each semester, the number of failed courses during the first four semesters, and the grades of three core courses, whereas students' proficiency in English and the number of registered credit hours do not play a major role in their success; (ii) Naïve Bayes performs best in predicting students' achievement, followed by Random Forest; and (iii) students who attend an orientation year do not have a greater chance of success at that college.

The second dataset comprises the records of Business Informatics master's students at the University of Mannheim in Germany. In this study, the data of more than 700 students were used to predict their final academic achievement using different machine learning libraries in Python. We compared the performance of nine data mining algorithms: Logistic Regression, Naïve Bayes, K-Nearest Neighbor, Artificial Neural Networks, Support Vector Machine, Random Forest, Gradient Boosting, Light Gradient Boosting, and Extreme Gradient Boosting. The models were validated using 10-fold cross-validation. The empirical results show that: (i) bagging and boosting algorithms achieve better predictive performance than individual classifiers, and (ii) semester grades are the most significant features for the predictive model, followed by students' culture and the distance from students' accommodation to the university campus. The outcomes of the two studies can be used to design a recommender system that enables timely interventions for the undergraduate students of the College of Computer and Information Science and the postgraduate students of the Business Informatics program.
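The validation scheme named in the abstract, 10-fold cross-validation used to compare classifiers, can be illustrated with a minimal sketch. This is not the thesis's code: it uses only the Python standard library rather than Weka or the machine learning libraries the author used. The data are synthetic and deliberately easy to separate, the feature names (`gpa`, `failed_courses`) are hypothetical, and a majority-class baseline and a 1-nearest-neighbor classifier stand in for the algorithms compared in the thesis. The folds are stratified so that each one keeps the class ratio of the full dataset, a common choice for imbalanced data, though the abstract does not say whether the author stratified.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    # Deal each class's shuffled indices round-robin across the k folds.
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def majority_predict(X_train, y_train, X_test):
    """Baseline: always predict the most frequent training label."""
    counts = defaultdict(int)
    for label in y_train:
        counts[label] += 1
    top = max(sorted(counts), key=counts.get)
    return [top] * len(X_test)


def one_nn_predict(X_train, y_train, X_test):
    """1-nearest-neighbor with squared Euclidean distance."""
    preds = []
    for x in X_test:
        nearest = min(
            range(len(X_train)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, X_train[i])),
        )
        preds.append(y_train[nearest])
    return preds


def cross_val_accuracy(predict_fn, X, y, k=10, seed=0):
    """Mean accuracy of predict_fn over k stratified folds."""
    scores = []
    for train, test in stratified_k_fold(y, k, seed):
        preds = predict_fn(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        correct = sum(p == y[i] for p, i in zip(preds, test))
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


# Synthetic, imbalanced toy data (an assumption, not the thesis datasets):
# 70 "pass" and 30 "fail" students with features (gpa, failed_courses).
data_rng = random.Random(42)
X = [(data_rng.uniform(3.0, 4.0), 0) for _ in range(70)]
X += [(data_rng.uniform(1.0, 2.5), data_rng.randint(3, 5)) for _ in range(30)]
y = ["pass"] * 70 + ["fail"] * 30

acc_majority = cross_val_accuracy(majority_predict, X, y)
acc_1nn = cross_val_accuracy(one_nn_predict, X, y)
print(f"majority baseline: {acc_majority:.2f}, 1-NN: {acc_1nn:.2f}")
```

Comparing every model against a majority-class baseline on the same folds shows how much of a reported accuracy is due to class imbalance alone. The toy classes here do not overlap, so the scores say nothing about real student data.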




This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.



