Examining Forgetting in Continual Pre-training of Aligned Large Language Models
(2401.03129)Abstract
Recent advances in LLMs have exhibited remarkable proficiency across various tasks. Given the potent applications of LLMs in numerous fields, there has been a surge in LLM development. In developing LLMs, a common practice involves continual pre-training on previously fine-tuned models. However, this can lead to catastrophic forgetting. In our work, we investigate the phenomenon of forgetting that occurs during continual pre-training on an existing fine-tuned LLM. We evaluate the impact of continuous pre-training on the fine-tuned LLM across various dimensions, including output format, knowledge, and reliability. Experiment results highlight the non-trivial challenge of addressing catastrophic forgetting during continual pre-training, especially the repetition issue.
We're not able to analyze this paper right now due to high demand.
Please check back later (sorry!).
Generate a summary of this paper on our Pro plan:
We ran into a problem analyzing this paper.