PopAut: An Annotated Corpus for Populism Detection in Austrian News Comments
Description
Sample of 1,200 comments posted under articles of the Austrian newspaper Der Standard, collected between January 2019 and November 2021. This dataset is published alongside the paper "PopAut: An Annotated Corpus for Populism Detection in Austrian News Comments" and is intended for detecting populist statements in German-language user comments under news articles. Details about the sampling and annotation process can be found in the paper and in the accompanying GitHub repository (https://github.com/ahmadouw/COV-Populism-Standard).
Abstract: Populism is a phenomenon that has been noticeably present in the political landscape of various countries over the past decades. While populism expressed by politicians has been thoroughly examined in the literature, populism expressed by citizens is still under-researched, especially when it comes to its automated detection in text. This work presents the PopAut corpus, which is the first annotated corpus of news comments for populism in the German language. It features 1,200 comments collected between 2019 and 2021 that are annotated for the populist motives anti-elitism, people-centrism and people-sovereignty. Following the definition of Cas Mudde, populism is seen as a thin ideology. This work shows that annotators reach a high agreement when labeling news comments for these motives. The data set is collected to serve as the basis for automated populism detection using machine-learning methods. By using transformer-based models, we can outperform existing dictionaries tailored for automated populism detection in German social media content. Therefore, our work provides a rich resource for future work on the classification of populist user comments in the German language.
Structure
- Each row contains an anonymized user comment and the binary labels given by each of the three annotators (per comment) for every motive
- anti1, anti2, anti3 indicate whether or not anti-elitism was found in the given comment
- cent1, cent2, cent3 indicate whether or not people-centrism was found in the given comment
- sov1, sov2, sov3 indicate whether or not people-sovereignty was found in the given comment
- none1, none2, none3 indicate whether or not none of the motives was found in the given comment
- Populism is the final label, assigned by majority vote; it marks a comment as populist if any of the motives is present in it
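The column layout above can be illustrated with a minimal Python sketch that recomputes a majority-vote label from the per-annotator columns. The column prefixes (anti, cent, sov, none) come from the list above. The exact aggregation rule is an assumption: here a motive counts as present when at least two of the three annotators marked it, and the comment counts as populist when at least one motive is present. The paper and repository remain the authoritative source for how the published Populism column was derived. The example row is hypothetical and not taken from the corpus.

```python
from typing import Dict, List

# Column prefixes for the three motives, as named in the Structure list.
MOTIVES = ("anti", "cent", "sov")


def majority(votes: List[int]) -> int:
    """Return 1 if more than half of the binary votes are 1, else 0."""
    return int(sum(votes) * 2 > len(votes))


def motive_labels(row: Dict[str, int]) -> Dict[str, int]:
    """Majority vote per motive over the three annotator columns."""
    return {m: majority([row[f"{m}{i}"] for i in (1, 2, 3)]) for m in MOTIVES}


def populism_label(row: Dict[str, int]) -> int:
    """ASSUMPTION: a comment is populist if any motive wins its majority vote."""
    return int(any(motive_labels(row).values()))


# Hypothetical row for illustration only (not real corpus data).
example_row = {
    "anti1": 1, "anti2": 1, "anti3": 0,
    "cent1": 0, "cent2": 0, "cent3": 1,
    "sov1": 0, "sov2": 0, "sov3": 0,
    "none1": 0, "none2": 0, "none3": 0,
}

print(motive_labels(example_row))
print(populism_label(example_row))
```

If the recomputed value disagrees with the Populism column for some rows, the published label most likely follows a different aggregation (for example, a majority vote over each annotator's "any motive" decision), which the paper should clarify.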
Further Details
- The data set is available for researchers upon request