Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena (2403.06965v1)
Abstract: Argument Structure Constructions (ASCs) are one of the most well-studied construction groups, providing a unique opportunity to demonstrate the usefulness of Construction Grammar (CxG). For example, the caused-motion construction (CMC, "She sneezed the foam off her cappuccino") demonstrates that constructions must carry meaning, otherwise the fact that "sneeze" in this context causes movement cannot be explained. We form the hypothesis that this remains challenging even for state-of-the-art LLMs, for which we devise a test based on substituting the verb with a prototypical motion verb. To be able to perform this test at statistically significant scale, in the absence of adequate CxG corpora, we develop a novel pipeline of NLP-assisted collection of linguistically annotated text. We show how dependency parsing and GPT-3.5 can be used to significantly reduce annotation cost and thus enable the annotation of rare phenomena at scale. We then evaluate GPT, Gemini, Llama2 and Mistral models for their understanding of the CMC using the newly collected corpus. We find that all models struggle with understanding the motion component that the CMC adds to a sentence.
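The verb-substitution test mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the motion verb `threw`, the `Example` container, and the function name `substitute_verb` are assumptions introduced here for illustration, and the abstract does not specify how the resulting sentence pair is presented to a model.

```python
from dataclasses import dataclass
from typing import List

# Assumption: "threw" stands in for a prototypical motion verb.
# The abstract does not name the verb that was actually used.
MOTION_VERB = "threw"


@dataclass
class Example:
    """A tokenized sentence plus the position of its main verb (hypothetical container)."""
    tokens: List[str]
    verb_index: int  # e.g. taken from the root of a dependency parse


def substitute_verb(example: Example, replacement: str = MOTION_VERB) -> str:
    """Return the sentence with the main verb swapped for `replacement`."""
    tokens = list(example.tokens)  # copy so the input is not mutated
    tokens[example.verb_index] = replacement
    return " ".join(tokens)


cmc = Example(tokens="She sneezed the foam off her cappuccino".split(), verb_index=1)
print(substitute_verb(cmc))
```

In a true CMC instance, the substituted sentence should still describe the same motion event, so a model that understands the construction would be expected to treat the two sentences as compatible.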
Authors: Leonie Weissweiler, Abdullatif Köksal, Hinrich Schütze