We present $\textbf{Platypus}$, a family of fine-tuned and merged LLMs that achieves the strongest performance and currently stands at first place in HuggingFace's Open LLM Leaderboard as of the release date of this work. In this work we describe (1) our curated dataset $\textbf{Open-Platypus}$, a subset of other open datasets, which $\textit{we release to the public}$; (2) our process of fine-tuning and merging LoRA modules in order to conserve the strong prior of pretrained LLMs while bringing specific domain knowledge to the surface; and (3) our efforts in checking for test data leaks and contamination in the training data, which can inform future research. Specifically, the Platypus family achieves strong performance in quantitative LLM metrics across model sizes, topping the global Open LLM Leaderboard while using just a fraction of the fine-tuning data and overall compute required for other state-of-the-art fine-tuned LLMs. In particular, a 13B Platypus model can be trained on $\textit{a single}$ A100 GPU using 25k questions in 5 hours. This is a testament to the quality of our Open-Platypus dataset, and opens opportunities for further improvements in the field. Project page: https://platypus-llm.github.io
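The fine-tune-and-merge recipe referenced in the abstract can be pictured with a short PEFT-style sketch: low-rank LoRA adapters are attached to a frozen base model, only the adapters are trained, and they are then folded back into the base weights. The base checkpoint, target modules, and hyperparameters below are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal sketch of fine-tuning LoRA adapters and merging them back into the base.
# Model name, target modules, and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-2-13b-hf"               # assumed base checkpoint
base_model = AutoModelForCausalLM.from_pretrained(base_id)

lora_cfg = LoraConfig(
    r=16,                        # rank of the low-rank update matrices
    lora_alpha=16,               # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=["gate_proj", "up_proj", "down_proj"],  # assumed target layers
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_cfg)        # base weights stay frozen
model.print_trainable_parameters()                  # only the small adapters train

# ... fine-tune `model` on the curated instruction data with a standard Trainer ...

merged = model.merge_and_unload()                   # fold the LoRA deltas into the base
merged.save_pretrained("platypus-13b-merged")       # assumed output path
```

Because only the adapter matrices receive gradients, the memory and compute footprint stays far below full fine-tuning, which is what makes single-GPU training of a 13B model plausible.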
The paper presents the Platypus family of LLMs and their superior performance on the HuggingFace Open LLM Leaderboard, achieved through efficient fine-tuning techniques with minimal data and computational resources.
Key contributions include the development of the Open-Platypus dataset, utilization of Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning, comprehensive data de-duplication processes, and model merging strategies to enhance overall performance.
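The de-duplication step can be illustrated with a small embedding-similarity sketch: each question is encoded once, and a question is kept only if it is not too close to anything already kept. The encoder name and the 0.8 cosine-similarity threshold are assumptions made for illustration, not necessarily the paper's settings.

```python
# Sketch of embedding-based de-duplication over a list of questions.
# Encoder choice and similarity threshold are assumptions for illustration.
import torch
from sentence_transformers import SentenceTransformer, util

def deduplicate(questions, threshold=0.8):
    encoder = SentenceTransformer("all-MiniLM-L6-v2")         # assumed embedding model
    embeddings = encoder.encode(
        questions, convert_to_tensor=True, normalize_embeddings=True
    )
    kept_questions, kept_embeddings = [], []
    for question, emb in zip(questions, embeddings):
        if kept_embeddings:
            sims = util.cos_sim(emb, torch.stack(kept_embeddings))
            if sims.max() >= threshold:                        # near-duplicate of a kept item
                continue
        kept_questions.append(question)
        kept_embeddings.append(emb)
    return kept_questions
```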
The research demonstrates strong quantitative results: the 13B Platypus model can be trained on a single A100 GPU with roughly 25k questions in 5 hours, and the Platypus2-70B-instruct variant achieves the highest average score on the Hugging Face Open LLM Leaderboard.
The paper "Platypus: Quick, Cheap, and Powerful Refinement of LLMs" introduces the Platypus family of LLMs, which have demonstrated superior performance on the HuggingFace Open LLM Leaderboard. The research tackles the challenge of fine-tuning LLMs while minimizing computational resources and data requirements, and importantly, avoids data contamination between training and test sets. The authors present a comprehensive methodology, along with their curated Open-Platypus dataset, which collectively enable robust yet efficient fine-tuning of LLMs.
The core contributions of the paper are manifold: the publicly released, curated Open-Platypus dataset; a parameter-efficient fine-tuning recipe based on LoRA that preserves the strong priors of the pretrained base models; careful de-duplication of the training data; checks for test-set leakage and contamination; and model merging strategies that enhance overall performance.
The paper presents strong numerical results, demonstrating that the 13B Platypus model can be trained on a single A100 GPU using 25k questions in just 5 hours. The Platypus2-70B-instruct variant achieved the highest average score on the Hugging Face Open LLM Leaderboard with 73.13%, surpassing both open-source and proprietary models like GPT-3.5 and GPT-4 in certain metrics.
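Merged variants such as Platypus2-70B-instruct combine fine-tuned models. One common way such merging can be sketched is element-wise weight interpolation between two fine-tuned variants of the same base architecture; the checkpoint names and mixing weight below are hypothetical, and this is a sketch of the general idea rather than the authors' exact merging recipe.

```python
# Sketch of merging two fine-tuned variants of the same base by weight interpolation.
# Checkpoint names and the mixing weight are hypothetical placeholders.
from transformers import AutoModelForCausalLM

model_a = AutoModelForCausalLM.from_pretrained("org/platypus-style-finetune")  # hypothetical
model_b = AutoModelForCausalLM.from_pretrained("org/instruct-finetune")        # hypothetical
alpha = 0.5                                                                    # mixing weight

state_a, state_b = model_a.state_dict(), model_b.state_dict()
merged_state = {
    # Interpolate floating-point weights; copy non-float buffers unchanged.
    key: alpha * state_a[key] + (1.0 - alpha) * state_b[key]
    if state_a[key].is_floating_point() else state_a[key]
    for key in state_a
}
model_a.load_state_dict(merged_state)      # reuse model_a to hold the merged weights
model_a.save_pretrained("merged-model")    # assumed output path
```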
Future research could delve into several avenues, including refining the LoRA fine-tuning and model merging strategies, extending the data contamination checks that the authors highlight as informative for subsequent work, and curating further high-quality instruction data in the spirit of Open-Platypus.
The research presented in this paper marks a significant step forward in the efficient fine-tuning of LLMs. The Platypus family leverages a small but potent dataset, advanced fine-tuning techniques, and rigorous data validation methods to achieve top-tier performance with reduced computational demands. These advancements not only facilitate broader access to powerful LLMs but also lay the groundwork for future innovations in model tuning and merging methodologies.