Investigating the importance of demographic features for EDM-predictions
Cohausz, Lea; Tschalzev, Andrej; Bartelt, Christian; Stuckenschmidt, Heiner
URL: https://educationaldatamining.org/EDM2023/proceedi...
Document Type: Conference or workshop publication
Year of publication: 2023
Book title: Proceedings of the 16th International Conference on Educational Data Mining
Page range: 125-136
Conference title: 16th Educational Data Mining Conference
Location of the conference venue: Bengaluru, India
Date of the conference: 11-14 July 2023
Editors: Feng, Mingyu; Käser, Tanja; Talukdar, Partha
Place of publication: Bengaluru, India
Publishing house: International Educational Data Mining Society
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
Subject: 004 Computer science, internet
Abstract:
Demographic features are commonly used in Educational Data Mining (EDM) research to predict at-risk students. Yet, the practice of using demographic features has to be considered extremely problematic due to the data’s sensitive nature, but also because (historic and representation) biases likely exist in the training data, which leads to strong fairness concerns. At the same time and despite the frequent use, the value of demographic features for prediction accuracy remains unclear. In this paper, we systematically investigate the importance of demographic features for at-risk prediction using several publicly available datasets from different countries. We find strong evidence that including demographic features does not lead to better-performing models as long as some study-related features exist, such as performance or activity data. Additionally, we show that models, nonetheless, place importance on these features when they are included in the data – although this is not necessary for accuracy. These findings, together with our discussion, strongly suggest that at-risk prediction should not include demographic features. Our code is available at: https://anonymous.4open.science/r/edm-F7D1.
This entry is part of the university bibliography.
ORCID:
Cohausz, Lea ; Tschalzev, Andrej ORCID: 0000-0002-0638-5744 ; Bartelt, Christian ; Stuckenschmidt, Heiner ORCID: 0000-0002-0209-3859