The value of publicly available, textual and non-textual information for startup performance prediction


Kaiser, Ulrich ; Kuhn, Johan M.


dp20012.pdf - Published version


URL: https://madoc.bib.uni-mannheim.de/55086
URN: urn:nbn:de:bsz:180-madoc-550867
Document type: Working paper
Year of publication: 2020
Journal or series title: ZEW Discussion Papers
Volume: 20-012
Place of publication: Mannheim
Language of publication: English
Institution: Other institutions > ZEW - Leibniz-Zentrum für Europäische Wirtschaftsforschung
MADOC publication series: Publications of ZEW (Leibniz-Zentrum für Europäische Wirtschaftsforschung) > ZEW Discussion Papers
Subject: 330 Economics
Classification: JEL: L26, C53
Keywords (English): Startup, performance, prediction, text as data
Abstract: Can publicly available, web-scraped data be used to identify promising business startups at an early stage? To answer this question, we use textual and non-textual information on the names and addresses of Danish firms and on their business purpose statements (BPSs), supplemented by core accounting information and by founder and initial startup characteristics, to forecast the performance of newly started enterprises over a five-year horizon. The performance outcomes we consider are involuntary exit, above-average employment growth, a return on assets above 20 percent, new patent applications and participation in an innovation subsidy program. Our first key finding is that our models predict startup performance with high or very high accuracy, except for high returns on assets, where predictive power remains poor. Our second key finding is that the data requirements for predicting performance outcomes with such accuracy are low. To forecast the two innovation-related outcomes well, we only need a set of variables derived from the BPS texts. An accurate prediction of startup survival and high employment growth requires combining (i) information derived from the names of the startups, (ii) data on elementary founder-related characteristics and (iii) either variables describing the initial characteristics of the startup (to predict survival) or BPS information (to predict high employment growth). These variables are easy to obtain because the underlying information must be reported upon business registration. The substantial accuracy of our predictions for survival, employment growth, new patents and participation in innovation subsidy programs indicates ample scope for algorithmic scoring models as an additional pillar of funding and innovation support decisions.




This document is provided by the publication server of the Universitätsbibliothek Mannheim.



