Continual Training of Language Models for Few-Shot Learning

Published 11 Oct 2022 in cs.CL, cs.AI, cs.LG, and cs.NE | (2210.05549v1)

Abstract: Recent work on applying large LMs has achieved impressive performance in many NLP applications. Adapting or post-training an LM using an unlabeled domain corpus can produce even better performance for end-tasks in the domain. This paper proposes the problem of continually extending an LM by incrementally post-training it with a sequence of unlabeled domain corpora to expand its knowledge without forgetting its previous skills. The goal is to improve few-shot end-task learning in these domains. The resulting system is called CPT (Continual PostTraining), which, to our knowledge, is the first continual post-training system. Experimental results verify its effectiveness.