Few-shot clinical entity recognition in English, French and Spanish: masked language models outperform generative model prompting (2402.12801v2)
Abstract: Large language models (LLMs) have become the preferred solution for many natural language processing tasks. In low-resource environments such as specialized domains, their few-shot capabilities are expected to deliver high performance. Named Entity Recognition (NER) is a critical task in information extraction that is not covered in recent LLM benchmarks. There is a need to better understand the performance of LLMs for NER in a variety of settings, including languages other than English. This study aims to evaluate generative LLMs, employed through prompt engineering, for few-shot clinical NER. We compare 13 auto-regressive models using prompting and 16 masked models using fine-tuning on 14 NER datasets covering English, French and Spanish. While prompt-based auto-regressive models achieve competitive F1 for general NER, they are outperformed within the clinical domain by lighter biLSTM-CRF taggers based on masked models. Additionally, masked models exhibit lower environmental impact than auto-regressive models. Findings are consistent across the three languages studied, which suggests that LLM prompting is not yet suited for production NER in the clinical domain.
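The abstract compares systems by F1. As a rough illustration of how such a score is commonly computed for NER, the sketch below derives a micro-averaged, exact-match entity-level F1 from BIO tag sequences. The BIO encoding, the exact-match criterion, and the names `bio_to_spans` and `entity_f1` are assumptions made for this example; the abstract does not specify the paper's evaluation protocol.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity label); an illustrative type alias
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one BIO tag sequence into a set of labeled spans."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == label
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is None:
            # Lenient choice (an assumption): an I- tag with no open span starts one.
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exact span-and-label matches."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = bio_to_spans(gold_tags)
        pred_spans = bio_to_spans(pred_tags)
        tp += len(gold_spans & pred_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up labels: one of two gold entities is recovered.
gold = [["B-DISO", "I-DISO", "O", "B-DRUG"]]
pred = [["B-DISO", "I-DISO", "O", "O"]]
score = entity_f1(gold, pred)
```

In this toy case precision is 1.0 and recall is 0.5. Whichever model family produces the tag sequences, the same scoring function can be applied, which is what makes a prompting-versus-fine-tuning comparison on a shared metric possible.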