Training Compact Models for Low Resource Entity Tagging using Pre-trained Language Models

Published 14 Oct 2019 in cs.CL and cs.LG | (1910.06294v2)

Abstract: Training models for low-resource named entity recognition tasks has been shown to be challenging, especially in industrial applications, where deploying updated models is a continuous effort that is crucial for business operations. In such cases unlabeled data is often abundant, while labeled data is scarce or unavailable. Pre-trained language models trained to extract contextual features from text have been shown to improve many NLP tasks, including tasks with scarce labels, by leveraging transfer learning. However, such models impose a heavy memory and computational burden, which makes them challenging to train and to deploy for inference. In this work in progress, we combine the effectiveness of transfer learning provided by pre-trained masked language models with a semi-supervised approach to train a fast and compact model using labeled and unlabeled examples. Preliminary evaluations show that the compact models achieve competitive accuracy at a 36x compression rate relative to a state-of-the-art pre-trained language model and run significantly faster at inference, allowing such models to be deployed in production environments or on edge devices.
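
The abstract does not spell out the semi-supervised procedure, so the following is only a minimal sketch of one common way to combine labeled and unlabeled examples: a large pre-trained "teacher" tags the unlabeled sentences, and the confident predictions are added to the gold data used to train the compact model. All names here (`build_student_training_set`, `toy_teacher`, `compression_rate`, `min_confidence`) are hypothetical, the teacher is a toy stand-in rather than a real language model, and the parameter counts are made-up numbers chosen only to illustrate how a compression rate could be computed as a ratio of model sizes.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = Sequence[str]
Tags = List[str]
Example = Tuple[Tokens, Tags]


def build_student_training_set(
    labeled: List[Example],
    unlabeled: List[Tokens],
    teacher_predict: Callable[[Tokens], Tuple[Tags, float]],
    min_confidence: float = 0.9,  # assumed threshold, not from the paper
) -> List[Example]:
    """Combine gold examples with teacher-tagged (pseudo-labeled) sentences."""
    training_set = list(labeled)
    for tokens in unlabeled:
        tags, confidence = teacher_predict(tokens)
        # Keep only confident predictions with one tag per token.
        if confidence >= min_confidence and len(tags) == len(tokens):
            training_set.append((tokens, tags))
    return training_set


def compression_rate(teacher_params: int, student_params: int) -> float:
    """Ratio of teacher size to student size, e.g. 36.0 means 36x smaller."""
    return teacher_params / student_params


def toy_teacher(tokens: Tokens) -> Tuple[Tags, float]:
    """Hypothetical stand-in for a pre-trained tagger: capitalized = entity."""
    tags = ["B-ENT" if tok[:1].isupper() else "O" for tok in tokens]
    confidence = 0.95 if any(tag != "O" for tag in tags) else 0.5
    return tags, confidence


labeled = [(["Paris", "is", "nice"], ["B-ENT", "O", "O"])]
unlabeled = [["visit", "Berlin"], ["nothing", "here"]]
student_data = build_student_training_set(labeled, unlabeled, toy_teacher)
```

In this sketch the compact model would then be trained on `student_data` instead of the small labeled set alone. Filtering by confidence is one design choice among several; training the student directly on the teacher's output distributions is another option that the abstract neither confirms nor rules out.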