Cluster analysis, measure of dispersion, life course, pension insurance, Germany
Free keywords (English):
clustering, measures of association, discrete data, time series
Abstract:
A new algorithm for clustering life course trajectories is presented and tested on large register data. Life courses are represented as sequences on a monthly timescale covering working life from age 16 to 65. A meaningful clustering of this kind of data yields subgroups with similar life course trajectories. The high sampling rate allows precise discrimination between the subgroups, but it also produces large amounts of highly correlated data in phases with low variability. The main challenge is to select the variables (points in time) that carry most of the relevant information. The new algorithm addresses this problem by clustering the data and, at the same time, identifying critical junctures for each of the relevant subgroups. The divisive algorithm can handle large amounts of multidimensional data in reasonable time, as demonstrated on data from the German federal pension insurance.
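The general idea of a divisive procedure that picks an informative point in time at each split can be illustrated with a small sketch. This is not the algorithm from the paper, whose splitting criterion is not given in the abstract. It is a generic monothetic divisive clustering that, as an assumption, splits each group on the month with the highest Shannon entropy. The function names, the state codes (S = school, E = employed, U = unemployed), and the toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete states."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def best_split_month(sequences):
    """Index of the month with the most varied states (assumed criterion)."""
    n_months = len(sequences[0])
    return max(range(n_months), key=lambda t: entropy([s[t] for s in sequences]))


def divisive_clusters(sequences, min_size=2, max_depth=2, path=()):
    """Recursively split on one month at a time.

    Returns a list of (path, members) pairs, where path records the
    (month, state) decisions that lead to the cluster.
    """
    if len(path) >= max_depth or len(sequences) < 2 * min_size:
        return [(path, sequences)]
    month = best_split_month(sequences)
    groups = {}
    for seq in sequences:
        groups.setdefault(seq[month], []).append(seq)
    # Stop if the chosen month does not separate the group usefully.
    if len(groups) < 2 or min(len(g) for g in groups.values()) < min_size:
        return [(path, sequences)]
    leaves = []
    for state, members in sorted(groups.items()):
        leaves.extend(
            divisive_clusters(members, min_size, max_depth, path + ((month, state),))
        )
    return leaves


# Hypothetical toy data: one character per month.
toy = [
    "SSEEEE", "SSEEEE",                      # early entry, stays employed
    "SSEEUU", "SSEEUU",                      # early entry, later unemployed
    "SSSEEE", "SSSEEE", "SSSEEE", "SSSEEE",  # later entry
]
leaves = divisive_clusters(toy)
for path, members in leaves:
    print(path, len(members))
```

In this toy example the months chosen for splitting (here months 2 and 4) play the role of critical junctures, and each subgroup gets its own: the early-entry group is split further on month 4, while the later-entry group is left whole.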
Additional information:
This document is provided by the publication server of the Universitätsbibliothek Mannheim.