
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains

(2402.10373)
Published Feb 15, 2024 in cs.CL, cs.AI, and cs.LG

Abstract

LLMs have demonstrated remarkable versatility in recent years, offering potential applications across specialized domains such as healthcare and medicine. Despite the availability of various open-source LLMs tailored for health contexts, adapting general-purpose LLMs to the medical domain presents significant challenges. In this paper, we introduce BioMistral, an open-source LLM tailored for the biomedical domain, utilizing Mistral as its foundation model and further pre-trained on PubMed Central. We conduct a comprehensive evaluation of BioMistral on a benchmark comprising 10 established medical question-answering (QA) tasks in English. We also explore lightweight models obtained through quantization and model merging approaches. Our results demonstrate BioMistral's superior performance compared to existing open-source medical models and its competitive edge against proprietary counterparts. Finally, to address the limited availability of data beyond English and to assess the multilingual generalization of medical LLMs, we automatically translated this benchmark into 7 other languages and evaluated it. This marks the first large-scale multilingual evaluation of LLMs in the medical domain. Datasets, multilingual evaluation benchmarks, scripts, and all the models obtained during our experiments are freely released.

Figure: BioMistral 7B's performance measured by the loss metric.

Overview

  • BioMistral is an open-source collection of pretrained LLMs optimized for medical domains, built upon the Mistral foundation model and further trained on PubMed Central.

  • The model showcases improvements in medical QA tasks, multilingual evaluation, and operational efficiency through advanced quantization and model merging techniques.

  • It has undergone extensive evaluation, showing superior performance over other medical models in both monolingual and multilingual settings.

  • BioMistral's development opens doors for advanced healthcare applications and invites further global research and adaptation efforts.

Enhancing Medical Domain Understanding with BioMistral: Open-Source Pretrained LLMs

Introduction to BioMistral

The paper presents BioMistral, a set of open-source pretrained LLMs optimized for applications within the medical domain. Based on the Mistral foundation model and enriched by further pretraining on PubMed Central, BioMistral represents a significant step towards making robust, domain-specific NLP capabilities more accessible to researchers and practitioners in the field of healthcare and medicine.

Distinctive Features of BioMistral

BioMistral introduces several innovations and improvements over existing medical LLMs:

  • Tailored Domain Optimization: By further pre-training Mistral on a carefully curated subset of PubMed Central, BioMistral achieves superior performance across a wide array of medical QA tasks.
  • Multilingual Evaluation: It expands the evaluation landscape by translating a benchmark of 10 medical QA tasks into seven languages, thus assessing the multilingual efficacy of medical LLMs at a scale previously unexplored.
  • Efficiency through Quantization: Through quantization and model merging techniques, the BioMistral models combine strong performance with operational efficiency, making them amenable to deployment on consumer-grade hardware (see the loading sketch after this list).
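
The following is a minimal sketch of loading a quantized BioMistral checkpoint on consumer-grade hardware. The paper experiments with activation-aware weight quantization (AWQ) among other strategies; this example uses 4-bit NF4 quantization via bitsandbytes as an illustrative stand-in, not the authors' exact pipeline, and assumes the released "BioMistral/BioMistral-7B" checkpoint on the Hugging Face Hub.

```python
# Illustrative quantized loading of BioMistral with bitsandbytes (not the paper's AWQ setup).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "BioMistral/BioMistral-7B"  # released checkpoint (assumed Hub identifier)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spreads layers across available GPU/CPU memory
)

prompt = "Question: What is the first-line treatment for type 2 diabetes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```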

Comprehensive Evaluation

BioMistral underwent a rigorous evaluation on a novel benchmark comprising 10 medical QA tasks. It demonstrated statistically significant improvements over other open-source medical models and holds its ground against proprietary models. In multilingual settings, although performance drops relative to English, the BioMistral models still outperform existing open-source medical baselines, underscoring their robustness across linguistic boundaries.
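
Multiple-choice medical QA benchmarks of this kind are commonly scored by comparing the log-likelihood the model assigns to each candidate answer. The sketch below illustrates that generic scoring approach; the paper's exact evaluation harness may differ, and the helper names (score_option, predict) are hypothetical.

```python
# Conceptual multiple-choice QA scoring by summed answer-token log-likelihood.
import torch

def score_option(model, tokenizer, question: str, option: str) -> float:
    """Total log-probability the model assigns to `option` as a continuation of `question`."""
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + " " + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids.to(model.device)).logits
    # Log-probs for each next token, conditioned on everything before it.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    target = full_ids[:, 1:].to(model.device)
    token_lp = log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)
    answer_start = prompt_ids.shape[1]  # index of the first answer token
    return token_lp[:, answer_start - 1:].sum().item()

def predict(model, tokenizer, question: str, options: list[str]) -> int:
    """Return the index of the highest-scoring answer option."""
    scores = [score_option(model, tokenizer, question, o) for o in options]
    return max(range(len(options)), key=lambda i: scores[i])
```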

The Mechanics of Model Adaptation

The adaptation method involves further pre-training the Mistral model on a dataset drawn from the PMC Open Access Subset to instill biomedical specificity into BioMistral. This continued pre-training, aimed at improving the model's handling of complex medical contexts, uses AdamW optimization while retaining Mistral's architectural features such as Grouped-Query Attention, ensuring the model's adeptness at medical domain tasks.
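
As a rough illustration of what such continued pre-training looks like, the sketch below runs a standard next-token-prediction loop with AdamW over biomedical text. The hyperparameters, the local file "pmc_open_access.txt", and the single-GPU setup are placeholders, not the authors' configuration.

```python
# Minimal sketch of continued causal-LM pre-training on biomedical text with AdamW.
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling

base_model = "mistralai/Mistral-7B-v0.1"  # foundation model used by BioMistral
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16).cuda()

# Hypothetical corpus file with passages from the PMC Open Access Subset.
corpus = load_dataset("text", data_files={"train": "pmc_open_access.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal-LM labels
loader = DataLoader(tokenized, batch_size=2, shuffle=True, collate_fn=collator)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

model.train()
for batch in loader:
    batch = {k: v.cuda() for k, v in batch.items()}
    loss = model(**batch).loss  # next-token prediction loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```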

Model Merging and Quantization Strategies

Model merging experiments, using techniques such as SLERP and TIES, indicated that combining specialized and general-domain models can result in improved performance and generalization capabilities. Furthermore, experiments with activation-aware weight quantization and other strategies underscore the potential for deploying BioMistral on devices with limited computational resources without significant loss in performance.
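
To make the merging idea concrete, the snippet below sketches SLERP (spherical linear interpolation) between corresponding weight tensors of a general-domain and a domain-adapted model. Real merges are typically done with dedicated toolkits that handle per-layer interpolation schedules and edge cases; the function names here are illustrative only.

```python
# Simplified per-tensor SLERP merge between two models with identical architectures.
import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors with factor t in [0, 1]."""
    a, b = w_a.flatten().float(), w_b.flatten().float()
    a_unit = a / (a.norm() + eps)
    b_unit = b / (b.norm() + eps)
    omega = torch.acos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    if omega.abs() < eps:
        merged = (1 - t) * a + t * b  # nearly parallel: fall back to linear interpolation
    else:
        merged = (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / torch.sin(omega)
    return merged.reshape(w_a.shape).to(w_a.dtype)

def merge_state_dicts(sd_general: dict, sd_medical: dict, t: float = 0.5) -> dict:
    """Merge two state dicts with matching keys and shapes via per-tensor SLERP."""
    return {k: slerp(sd_general[k], sd_medical[k], t) for k in sd_general}
```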

Practical Implications and Future Prospects

BioMistral holds promise for a variety of applications in healthcare and medicine, from enhancing medical literature search capabilities to facilitating patient care through improved understanding of medical queries. Its open-source nature invites further experimentation and adaptation by the global research community. The work paves the way for future developments, particularly in advancing model calibration, reliability, and multilingual capabilities, as well as exploring domain-specific adaptations beyond the sphere of medicine.

Key Contributions

  • Domain-Specific Pretraining: Leveraging PubMed Central to train Mistral model variants tailored for the biomedical domain.
  • Multilingual Benchmark Creation: Extending the evaluation of medical LLMs to additional languages.
  • Advanced Model Quantization: Implementing quantization techniques that reduce memory and compute requirements with minimal loss in accuracy.

Conclusion

BioMistral represents a significant advancement in the development of domain-specific LLMs for the biomedical field, showing marked improvements over existing models across a range of metrics. By combining the foundational strengths of Mistral with advanced pre-training and model optimization techniques, BioMistral emerges as a powerful tool for researchers and practitioners working at the intersection of AI and healthcare. The open-source release of datasets, benchmarks, and models underlines the authors' commitment to transparency and collaboration in advancing the state of the art in medical NLP.
