ECPR

Does BERT find the 'families with children'? Comparing Methods of Identifying Group Appeals in Political Texts

Methods
Quantitative
Communication
Big Data
Marvin Stecker
University of Vienna
Hajo Boomgaarden
University of Vienna

Abstract

Social groups play an essential role in political debate. Their invocation by political actors can serve as a strategic communication tool in electoral campaigns, drawing symbolic boundaries to demarcate social groups and strengthening the association between collective identities and political parties (Huber, 2022; Kreiss & McGregor, 2022; Kreiss et al., 2020). Indeed, group mentions not only appeal to but also help to constitute the electoral coalitions that parties and politicians vie for, by politicizing seemingly benign social distinctions (Disch, 2021; Harteveld, 2021; Proctor, 2022). Manual content analysis has started to offer comparative insights into the communicative uses of group appeals across different periods, fora and countries (e.g., Dolinsky, 2022; Huber, 2022; Stuckelberger & Tresch, 2022). However, particular sources of political text, especially parliamentary debates, offer a tantalizingly rich yet overwhelming amount of data for further studies. Different computational aids, from deductive to interpretative designs, could be used to analyze these, but all meet the same hurdle: finding group appeals in the first place. We compare different computational methods of detecting social groups in both English- and German-language political texts. Specifically, we look at using the affordances of word embeddings and larger language models, building on the pioneering work by Licht & Sczepanski (2022). We extend their empirical evidence by employing training samples in two languages to evaluate transfer between languages and by testing fine-grained coding schemas. The methods are evaluated for their validity, accessibility and analytic potential. Validity is assessed through a specific coding of different group categories, which allows us to investigate blind spots and performance across various groups and so ensure research integrity. Accessibility, which concerns lowering barriers of entry into computational methods, considers the strategies' transferability and local resource demands.
Lastly, their analytic potential is scrutinized, highlighting the (dis-)advantages of approaches that operate at the word, paragraph, or corpus level. We use data from parliamentary corpora and political manifestos, which are among the most-used and most-studied forms of political communication and offer many possibilities for integration with secondary data sets, e.g., on policy appeals (Horn, 2021; Thau, 2021). We use German, Austrian, and UK manifestos available from the Manifesto Project (Lehmann et al., 2022) and parliamentary data from the German, Austrian and UK parliaments (Blaette, 2020; Wissik & Pirker, 2018). Secondary data sources help ensure the quality of our data. Our findings can help researchers make better-informed decisions about potential research avenues and the role that automated content analysis methods can play. We also point out the shortcomings of these methods and the need for their further development.