Predicting students' academic achievement using methods of educational data mining

Alturki, Sarah

URN: urn:nbn:de:bsz:180-madoc-636741
Document Type: Doctoral dissertation
Year of publication: 2022
Place of publication: Mannheim
University: Universität Mannheim
Evaluator: Stuckenschmidt, Heiner
Date of oral examination: 19 December 2022
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 004 Computer science, internet
Keywords (English): student performance, machine learning, educational data mining, students' dropout, predictions, imbalanced dataset, oversampling methods
Abstract: The tremendous growth in educational data creates a need to extract meaningful information from it. Educational Data Mining (EDM) has become an active research area that can reveal valuable knowledge from educational databases. This knowledge can serve many purposes, including identifying dropouts or weak students who need special attention and discovering exceptional students who can be presented with lifetime opportunities. This thesis introduces the reader to the field of EDM as a whole, with more detail on academic prediction tasks. It provides a comprehensive background for understanding EDM and discusses the different methods and applications of data mining in education. It also provides a rich literature review on predicting students' academic achievement, covering related work from 2007 to 2022. Furthermore, it examines the application of machine learning algorithms to predict students' academic achievement on two diverse datasets.
The first dataset was obtained from the College of Computer and Information Science at Princess Norah University (PNU) in Riyadh, Saudi Arabia. In this work, the records of 300 undergraduate students were used to predict their final academic achievement. We used the Weka software to compare the performance of eight data mining algorithms in predicting students' academic achievement: C4.5, Simple CART, LADTree, Support Vector Machine, Naïve Bayes, K-Nearest Neighbor, Artificial Neural Networks, and Random Forest. The models were validated using 10-fold cross-validation. The empirical results show that: (i) in the College of Computer and Information Science, the most essential features for predicting student academic achievement are the student's GPA in each semester, the number of failed courses during the first four semesters, and the grades of three core courses, whereas students' proficiency in English and the number of registered credit hours do not play a major role in their success; (ii) Naïve Bayes performs best in predicting students' achievement, followed by Random Forest; and (iii) students who attend an orientation year do not have a greater chance of success at that college.
The second dataset comprises the records of the Business Informatics master's students at the University of Mannheim in Germany. In this work, the records of more than 700 students were used to predict their final academic achievement using different machine learning libraries in Python. We compared the performance of nine data mining algorithms in predicting students' academic achievement: Logistic Regression, Naïve Bayes, K-Nearest Neighbor, Artificial Neural Networks, Support Vector Machine, Random Forest, Gradient Boosting, Light Gradient Boosting, and Extreme Gradient Boosting. The models were validated using 10-fold cross-validation. The empirical results show that: (i) bagging and boosting algorithms achieve better predictive performance than individual classifiers, and (ii) the semester grades are the most significant features for the predictive model, followed by students' culture and the distance from students' accommodation to the university campus. The outcomes of the two studies can be used to design a recommender system that enables timely interventions for the undergraduate students of the College of Computer and Information Science and the postgraduate students of the Business Informatics program.
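The 10-fold cross-validation protocol named in the abstract can be illustrated with a minimal, dependency-free Python sketch. This is not the thesis code: the synthetic data, the two features (GPA and number of failed courses), and every function and class name below are hypothetical, and Gaussian Naïve Bayes stands in for only one of the compared algorithms. Stratified splitting is also an assumption, since the abstract does not say how the folds were drawn.

```python
import math
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a share of it.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and class."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            columns = list(zip(*rows))
            means = [sum(col) / len(col) for col in columns]
            # Small constant avoids a zero variance.
            variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                         for col, m in zip(columns, means)]
            self.stats[label] = (math.log(len(rows) / len(y)), means, variances)
        return self

    def _log_posterior(self, label, x):
        log_prior, means, variances = self.stats[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_posterior(c, x))
                for x in X]


def cross_validate(model_factory, X, y, k=10, seed=0):
    """Return the accuracy on each of the k held-out folds."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = model_factory().fit([X[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return scores


def make_synthetic_students(n=300, seed=0):
    """Hypothetical data: [GPA, failed courses] -> 1 (succeeds) or 0 (at risk)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.7 else 0
        gpa = rng.gauss(3.2 if label else 2.2, 0.4)
        failed = max(0.0, rng.gauss(0.5 if label else 2.5, 1.0))
        X.append([gpa, failed])
        y.append(label)
    return X, y


X, y = make_synthetic_students()
scores = cross_validate(GaussianNaiveBayes, X, y, k=10)
mean_accuracy = sum(scores) / len(scores)
print(f"10-fold mean accuracy: {mean_accuracy:.3f}")
```

To compare several algorithms in the way the abstract describes, each candidate would be passed to `cross_validate` with the same folds, so that differences in mean accuracy reflect the models and not the split.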

This entry is part of the University Bibliography.

This document is provided by the publication server of the Mannheim University Library.
