Positioning Political Texts with Large Language Models by Asking and Averaging (2311.16639v3)
Abstract: We use instruction-tuned LLMs such as GPT-4, Llama 3, Mixtral, or Aya to position political texts within policy and ideological spaces. We ask an LLM where a tweet or a sentence of a political text stands on the focal dimension, then average the LLM responses to position political actors such as US Senators, or longer texts such as UK party manifestos or EU policy speeches given in 10 different languages. The correlations between the position estimates obtained with the best LLMs and benchmarks based on text coding by experts, crowdworkers, or roll call votes exceed 0.90. This approach is generally more accurate than the positions obtained with supervised classifiers trained on large amounts of research data. Using instruction-tuned LLMs to position texts in policy and ideological spaces is fast, cost-efficient, reliable, and reproducible (in the case of open LLMs), even when the texts are short and written in different languages. We conclude with cautionary notes about the need for empirical validation.
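The ask-and-average idea described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' code or prompts: `score_fn` is a hypothetical stand-in for a call to an instruction-tuned LLM that returns a numeric position for one short text (or `None` when the text has no position on the dimension), the 0 to 10 scale is arbitrary, and the toy scores and benchmark values are invented so the example runs offline.

```python
from collections import defaultdict
from statistics import mean


def average_positions(texts, score_fn):
    """Average per-text scores into one position per actor.

    texts: iterable of (actor, text) pairs.
    score_fn: hypothetical callable standing in for an LLM query; returns a
    float, or None when the text takes no position on the focal dimension.
    """
    scores = defaultdict(list)
    for actor, text in texts:
        score = score_fn(text)
        if score is not None:  # skip texts the scorer could not place
            scores[actor].append(score)
    return {actor: mean(values) for actor, values in scores.items()}


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Toy stand-in for the LLM: a fixed lookup on an assumed 0-10 scale.
TOY_SCORES = {"a1": 2.0, "a2": 4.0, "b1": 8.0, "b2": None, "b3": 6.0, "c1": 5.0}


def toy_score(text):
    return TOY_SCORES.get(text)


texts = [("A", "a1"), ("A", "a2"), ("B", "b1"), ("B", "b2"), ("B", "b3"), ("C", "c1")]
positions = average_positions(texts, toy_score)

# Invented benchmark positions (for example, expert codings) to validate against.
benchmark = {"A": 1.0, "B": 3.0, "C": 2.0}
actors = sorted(positions)
r = pearson([positions[a] for a in actors], [benchmark[a] for a in actors])
```

In practice `toy_score` would be replaced by a function that sends a prompt to the chosen model and parses a number from the reply; the correlation against an external benchmark is the validation step the abstract calls for.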