Enhancing theory-informed dictionary approaches with “glass-box” machine learning: The case of integrative complexity in social media comments

Dobbrick, Timo; Jakob, Julia; Chan, Chung-hong; Wessler, Hartmut

DOI:	https://doi.org/10.1080/19312458.2021.1999913
URL:	https://www.tandfonline.com/doi/full/10.1080/19312...
Additional URL:	https://www.researchgate.net/publication/356339614...
Document type:	Journal article
Year of publication:	2021
Journal or series title:	Communication Methods and Measures
Volume:	16
Issue:	4
Page range:	303-320
Place of publication:	Philadelphia, PA
Publisher:	Routledge, Taylor & Francis Group
ISSN:	1931-2458, 1931-2466
Publication language:	English
Institution:	Non-faculty institutions > MZES - Research Department B; School of Humanities > Media and Communication Studies (Wessler 2007-)
Subject:	070 News media, journalism, publishing
Abstract:	Dictionary-based approaches to computational text analysis have been shown to perform relatively poorly, particularly when the dictionaries rely on simple bags of words, are not specified for the domain under study, and add word scores without weighting. While machine learning approaches usually perform better, they offer little insight into (a) which of the assumptions underlying dictionary approaches (bag-of-words, domain transferability, or additivity) impedes performance most, and (b) which language features drive the algorithmic classification most strongly. To fill both gaps, we offer a systematic assumption-based error analysis, using the integrative complexity of social media comments as our case in point. We show that attacking the additivity assumption offers the strongest potential for improving dictionary performance. We also propose to combine off-the-shelf dictionaries with supervised “glass box” machine learning algorithms (as opposed to the usual “black box” machine learning approaches) to classify texts and learn about the most important features for classification. This dictionary-plus-supervised-learning approach performs similarly well as classic full-text machine learning or deep learning approaches, but yields interpretable results in addition, which can inform theory development on top of enabling a valid classification.
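The dictionary-plus-supervised-learning idea described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the toy dictionary, its category names, the labelled comments, and the helper functions are all hypothetical, and logistic regression is used here merely as one example of an interpretable ("glass box") model. The sketch contrasts classic additive scoring, which sums category hits without weighting, with a model that learns one signed weight per dictionary category.

```python
import math

# Hypothetical toy dictionary; the categories and word lists are illustrative
# assumptions, not the dictionaries used in the article.
DICTIONARY = {
    "differentiation": {"but", "however", "although", "yet"},
    "integration": {"therefore", "because", "overall", "thus"},
    "certainty": {"always", "never", "obviously", "clearly"},
}
CATEGORIES = sorted(DICTIONARY)


def category_features(text):
    """Return the share of tokens falling into each dictionary category."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    n = max(len(tokens), 1)
    return [sum(t in DICTIONARY[c] for t in tokens) / n for c in CATEGORIES]


def additive_score(text):
    """Classic dictionary scoring: add category shares without weighting."""
    return sum(category_features(text))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(text, w, b):
    """Predicted probability that a comment belongs to class 1."""
    x = category_features(text)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Made-up labelled comments (1 = complex, 0 = simple), for illustration only.
TRAIN = [
    ("I agree, but the costs matter because budgets are tight", 1),
    ("However, both sides have a point, therefore we need a compromise", 1),
    ("Although it helps some people, overall the effect is mixed", 1),
    ("This is obviously wrong and always will be", 0),
    ("They never listen, clearly a waste of time", 0),
    ("Always the same nonsense, obviously", 0),
]

X = [category_features(text) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
weights, bias = fit_logistic(X, y)

# The learned weights are the interpretable part: one signed weight per category.
for name, weight in zip(CATEGORIES, weights):
    print(f"{name:16s} {weight:+.2f}")
```

In this toy setting the unweighted sum cannot tell the two classes apart by direction, because every dictionary hit raises the score, whereas the fitted weights show which categories push a comment toward or away from the "complex" class. A real application would need validated dictionaries, far more labelled data, and held-out evaluation.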

This entry is part of the university bibliography.
