Is GPT-3 a Good Data Annotator?

arXiv:2212.10450
Published Dec 20, 2022 in cs.CL

Abstract

Data annotation is the process of labeling data that can be used to train machine learning models. High-quality annotation is crucial, as it allows a model to learn the relationship between the input data and the desired output. GPT-3, a large-scale language model developed by OpenAI, has demonstrated impressive zero- and few-shot performance on a wide range of NLP tasks. It is therefore natural to ask whether it can be used to annotate data effectively for NLP tasks. In this paper, we evaluate the performance of GPT-3 as a data annotator by comparing it with traditional data annotation methods and analyzing its output on a range of tasks. Through this analysis, we aim to provide insight into the potential of GPT-3 as a general-purpose data annotator for NLP.
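
As a concrete illustration of the setup the abstract describes, the sketch below shows how a GPT-3-style completion model could be prompted to label unlabeled text in a zero-shot fashion. This is a minimal sketch, not the paper's actual annotation pipeline: it assumes the legacy OpenAI Python SDK (openai < 1.0), and the prompt wording, the `text-davinci-003` model choice, and the binary sentiment label set are illustrative assumptions.

```python
# Minimal sketch: zero-shot data annotation with a GPT-3-style completion API.
# Assumes the legacy OpenAI Python SDK (openai < 1.0); the prompt, model name,
# and label set are illustrative assumptions, not the paper's exact choices.
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: supplied by the user

LABELS = ["positive", "negative"]  # hypothetical binary sentiment label set


def annotate(sentence: str) -> str:
    """Ask the model to assign a label to one unlabeled example."""
    prompt = (
        "Label the sentiment of the sentence as 'positive' or 'negative'.\n"
        f"Sentence: {sentence}\n"
        "Label:"
    )
    response = openai.Completion.create(
        model="text-davinci-003",  # a GPT-3 model available at the time of the paper
        prompt=prompt,
        max_tokens=3,
        temperature=0,  # deterministic output is preferable for annotation
    )
    answer = response["choices"][0]["text"].strip().lower()
    # Fall back to the first label if the reply is off-vocabulary.
    return answer if answer in LABELS else LABELS[0]


if __name__ == "__main__":
    print(annotate("The movie was a delight from start to finish."))
```

The labels produced this way would then be used as (noisy) training data for a smaller task-specific model, which is the kind of annotation-versus-human-labeling trade-off the paper evaluates.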

