Are Chatbots Reliable Text Annotators? Sometimes (2311.05769v2)

Published 9 Nov 2023 in cs.CL and cs.AI

Abstract: Recent research highlights the significant potential of ChatGPT for text annotation in social science research. However, ChatGPT is a closed-source product, which has major drawbacks with regard to transparency, reproducibility, cost, and data protection. Recent advances in open-source (OS) LLMs offer an alternative without these drawbacks. Thus, it is important to evaluate the performance of OS LLMs relative to ChatGPT and to standard approaches to supervised machine learning classification. We conduct a systematic comparative evaluation of the performance of a range of OS LLMs alongside ChatGPT, using both zero- and few-shot learning as well as generic and custom prompts, with results compared to supervised classification models. Using a new dataset of tweets from US news media, and focusing on simple binary text annotation tasks, we find significant variation in the performance of ChatGPT and OS models across the tasks, and that the supervised classifier using DistilBERT generally outperforms both. Given the unreliable performance of ChatGPT and the significant challenges it poses to Open Science, we advise caution when using ChatGPT for substantive text annotation tasks.
