
Exploring the Value of Pre-trained Language Models for Clinical Named Entity Recognition (2210.12770v4)

Published 23 Oct 2022 in cs.CL, cs.AI, and cs.LG

Abstract: The practice of fine-tuning Pre-trained Language Models (PLMs) from general or domain-specific data to a specific task with limited resources has gained popularity within the field of NLP. In this work, we re-visit this assumption and carry out an investigation in clinical NLP, specifically Named Entity Recognition on drugs and their related attributes. We compare Transformer models that are trained from scratch to fine-tuned BERT-based language models, namely BERT, BioBERT, and ClinicalBERT. Furthermore, we examine the impact of an additional CRF layer on such models to encourage contextual learning. We use n2c2-2018 shared task data for model development and evaluations. The experimental outcomes show that 1) CRF layers improved all language models; 2) referring to BIO-strict span-level evaluation using macro-average F1 score, although the fine-tuned language models achieved 0.83+ scores, the TransformerCRF model trained from scratch achieved 0.78+, demonstrating comparable performance at much lower cost, e.g. with 39.80% fewer training parameters; 3) referring to BIO-strict span-level evaluation using weighted-average F1 score, ClinicalBERT-CRF, BERT-CRF, and TransformerCRF exhibited smaller score differences, with 97.59%/97.44%/96.84% respectively; 4) applying efficient training by down-sampling for better data distribution further reduced the training cost and need for data, while maintaining similar scores, i.e. around 0.02 points lower compared to using the full dataset. Our models will be hosted at https://github.com/HECTA-UoM/TransformerCRF
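The abstract reports both macro-average and weighted-average F1 under BIO-strict span-level evaluation, and the two can diverge sharply when entity types are imbalanced. The following is a minimal sketch of that difference, not the authors' evaluation script: the helper names `bio_spans` and `span_f1` are hypothetical, the tag sequences are toy data, and the averages are taken only over labels that occur in the gold annotation (an assumption of this sketch).

```python
from collections import Counter


def bio_spans(tags):
    """Return the set of (label, start, end) spans in a BIO tag sequence (end exclusive)."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                start, label = None, None  # "O" or a stray "I-" tag opens nothing
    return spans


def span_f1(gold_tags, pred_tags):
    """Strict span matching: a prediction counts only if label and both boundaries agree."""
    gold, pred = bio_spans(gold_tags), bio_spans(pred_tags)
    support = Counter(lbl for lbl, _, _ in gold)       # gold spans per label
    hits = Counter(lbl for lbl, _, _ in gold & pred)   # exact matches per label
    n_pred = Counter(lbl for lbl, _, _ in pred)        # predicted spans per label
    f1 = {}
    for lbl in support:  # assumption: average only over labels present in gold
        p = hits[lbl] / n_pred[lbl] if n_pred[lbl] else 0.0
        r = hits[lbl] / support[lbl]
        f1[lbl] = 2 * p * r / (p + r) if p + r else 0.0
    macro = sum(f1.values()) / len(f1)  # every label counts equally
    weighted = sum(f1[l] * support[l] for l in f1) / sum(support.values())  # by gold support
    return f1, macro, weighted


# Toy data: three frequent "Drug" spans are correct, one rare "ADE" span has a wrong boundary.
gold = ["B-Drug", "O", "B-Drug", "I-Drug", "O", "B-Drug", "O", "B-ADE", "I-ADE"]
pred = ["B-Drug", "O", "B-Drug", "I-Drug", "O", "B-Drug", "O", "B-ADE", "O"]

per_label, macro_f1, weighted_f1 = span_f1(gold, pred)
print(per_label, macro_f1, weighted_f1)
```

In this toy case the single boundary error on the rare label halves the macro average while the weighted average stays higher, which illustrates why the gap between models can look smaller under weighted-average F1 than under macro-average F1.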

