New data sources in social science research: Things to know before working with Reddit data


Amaya, Ashley ; Bach, Ruben L. ; Keusch, Florian ; Kreuter, Frauke



DOI: https://doi.org/10.1177/0894439319893305
URL: https://journals.sagepub.com/doi/full/10.1177/0894...
Document Type: Article
Year of online publication: 2019
Journal or publication series: Social Science Computer Review : SSCORE
Page range: 1-10
Place of publication: Thousand Oaks, CA ; London
Publisher: Sage
ISSN: 0894-4393, 1552-8286
Publication language: English
Institution: School of Social Sciences > Statistik u. Sozialwiss. Methodenlehre (Juniorprofessur) (Keusch 2016-)
Subject: 004 Computer science, internet
300 Social sciences, sociology, anthropology
Abstract: Social media are becoming more popular as a source of data for social science researchers. These data are plentiful and offer the potential to answer new research questions at smaller geographies and for rarer subpopulations. When deciding whether to use data from social media, it is useful to learn as much as possible about the data and its source. Social media data have properties quite different from those with which many social scientists are used to working, so the assumptions often used to plan and manage a project may no longer hold. For example, social media data are so large that they may not be able to be processed on a single machine; they are in file formats with which many researchers are unfamiliar, and they require a level of data transformation and processing that has rarely been required when using more traditional data sources (e.g., survey data). Unfortunately, this type of information is often not obvious ahead of time as much of this knowledge is gained through word-of-mouth and experience. In this article, we attempt to document several challenges and opportunities encountered when working with Reddit, the self-proclaimed “front page of the Internet” and popular social media site. Specifically, we provide descriptive information about the Reddit site and its users, tips for using organic data from Reddit for social science research, some ideas for conducting a survey on Reddit, and lessons learned in merging survey responses with Reddit posts. While this article is specific to Reddit, researchers may also view it as a list of the type of information one may seek to acquire prior to conducting a project that uses any type of social media data.

This entry is part of the university bibliography.

This publication has so far appeared online only.




Citation example:

Amaya, Ashley ; Bach, Ruben L. ORCID: 0000-0001-5690-2829 ; Keusch, Florian ORCID: 0000-0003-1002-4092 ; Kreuter, Frauke ORCID: 0000-0002-7339-2645 (2019) New data sources in social science research: Things to know before working with Reddit data. Social Science Computer Review : SSCORE Thousand Oaks, CA ; London 1-10 [Article]

