Pre-trained nonresponse prediction in panel surveys with machine learning


Collins, John; Kern, Christoph



DOI: https://doi.org/10.18148/srm/2025.v19i2.8473
URL: https://ojs.ub.uni-konstanz.de/srm/article/view/84...
Additional URL: https://www.researchgate.net/publication/394429972...
URN: urn:nbn:de:bsz:180-madoc-706556
Document Type: Article
Year of publication: 2025
Journal or publication series: Survey Research Methods : SRM
Volume: 19
Issue number: 2
Page range: 123-137
Place of publication: Konstanz
Publishing house: European Survey Research Assoc.
ISSN: 1864-3361
Publication language: English
Institution: Non-faculty institutions > Mannheim Centre for European Social Research - Research Department A
Pre-existing license: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 310 Statistics
Keywords (English): panel surveys, machine learning, nonresponse, attrition, prediction
Abstract: While predictive modeling for unit nonresponse in panel surveys has been explored in various contexts, it is still under-researched how practitioners can best adopt these techniques. Currently, practitioners need to wait until they accumulate enough data in their panel to train and evaluate their own modeling options. This paper presents a novel “cross-training” technique in which we show that the indicators of nonresponse are so ubiquitous across studies that it is viable to train a model on one panel study and apply it to a different one. The practical benefit of this approach is that newly commencing panels can potentially make better nonresponse predictions in the early waves because these pre-trained models make use of more data. We demonstrate this technique with five panel surveys which encompass a variety of survey designs: the Socio-Economic Panel (SOEP), the German Internet Panel (GIP), the GESIS Panel, the Mannheim Corona Study (MCS), and the Family Demographic Panel (FREDA). We demonstrate that nonresponse history and demographics, paired with tree-based modeling methods, make highly accurate and generalizable predictions across studies, despite differences in panel design. We show how cross-training can effectively predict nonresponse in early panel waves where attrition is typically highest.
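
The cross-training idea in the abstract (fit a model on one panel, then score a different panel) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the data are synthetic, the feature names (`missed_last`, `past_share`, `age`) and the data-generating process are hypothetical, and the small hand-written decision tree merely stands in for the tree-based methods the abstract mentions.

```python
import random

def simulate_panel(n, seed, miss_rate):
    """Synthetic stand-in for one panel (hypothetical features, not real data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        missed_last = float(rng.random() < miss_rate)   # nonresponse history
        past_share = min(1.0, max(0.0, rng.gauss(0.5 if missed_last else 0.1, 0.1)))
        age = float(rng.randint(18, 80))                # demographic
        # Assumed process: response history dominates, age has a weak effect.
        p = 0.05 + 0.6 * missed_last + 0.3 * past_share - 0.001 * (age - 18)
        rows.append(([missed_last, past_share, age], int(rng.random() < p)))
    return rows

def gini(rows):
    """Gini impurity of the binary nonresponse label."""
    p = sum(y for _, y in rows) / len(rows)
    return 2 * p * (1 - p)

def best_split(rows, min_leaf):
    """Return (impurity, feature index, threshold) of the best split, or None."""
    best = None
    for j in range(len(rows[0][0])):
        for t in sorted({x[j] for x, _ in rows}):
            left = [r for r in rows if r[0][j] <= t]
            right = [r for r in rows if r[0][j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def fit_tree(rows, depth=3, min_leaf=20):
    """Grow a small tree; leaves hold the observed nonresponse share."""
    p = sum(y for _, y in rows) / len(rows)
    split = best_split(rows, min_leaf) if depth > 0 else None
    if split is None:
        return p
    _, j, t = split
    left = [r for r in rows if r[0][j] <= t]
    right = [r for r in rows if r[0][j] > t]
    return (j, t, fit_tree(left, depth - 1, min_leaf), fit_tree(right, depth - 1, min_leaf))

def predict(tree, x):
    """Walk the tree and return the leaf's estimated nonresponse probability."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

source = simulate_panel(600, seed=1, miss_rate=0.2)   # established panel
target = simulate_panel(400, seed=2, miss_rate=0.3)   # newly started panel

model = fit_tree(source)                               # trained on the source panel only
hits = sum((predict(model, x) >= 0.5) == bool(y) for x, y in target)
accuracy = hits / len(target)
rate = sum(y for _, y in target) / len(target)
baseline = max(rate, 1 - rate)                         # always predict the majority class
print(f"cross-trained accuracy={accuracy:.3f}, majority baseline={baseline:.3f}")
```

The target panel contributes no training data here, which mirrors the situation of a new panel in its early waves. With real studies, the harder part is likely to be harmonizing features such as response history and demographics so that both panels share the same columns.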




This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.



