Using large language models for preprocessing and information extraction from unstructured text: A proof-of-concept application in the social sciences


Schwitter, Nicole


PDF: schwitter-2025-using-large-language-models-for-preprocessing-and-information-extraction-from-unstructured-text-a-proof.pdf - Published (196kB)

DOI: https://doi.org/10.1177/20597991251313876
URL: https://journals.sagepub.com/doi/10.1177/205979912...
URN: urn:nbn:de:bsz:180-madoc-688782
Document Type: Article
Year of online publication: 2025
Date: 17 January 2025
Journal or publication series: Methodological Innovations
Volume: tba
Issue number: tba
Place of publication: London
Publishing house: Sage Publishing
ISSN: 2059-7991
Publication language: English
Institution: Non-faculty institutions > Mannheim Centre for European Social Research - Research Department A
Pre-existing license: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 300 Social sciences, sociology, anthropology
Abstract: Recent months have witnessed an increase in suggested applications for large language models (LLMs) in the social sciences. This proof-of-concept paper explores the use of LLMs to improve text quality and to extract predefined information from unstructured text. The study showcases promising results with an example focussed on historical newspapers and highlights the effectiveness of LLMs in correcting errors in the parsed text and in accurately extracting specified information. By leveraging the capabilities of LLMs in these straightforward, instruction-based tasks, this research note demonstrates their potential to improve the efficiency and accuracy of text analysis workflows. The ongoing development of LLMs and the emergence of robust open-source options underscore their increasing accessibility for both the quantitative and qualitative social sciences, as well as for other disciplines working with text data.




This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.

This publication has so far appeared only online.



