Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models (2403.03432v1)
Abstract: Instruction tuning has the potential to stimulate or enhance specific capabilities of LLMs. However, achieving the right balance of data is crucial to prevent catastrophic forgetting and interference between tasks. To address these limitations and enhance training flexibility, we propose the Mixture-of-LoRAs (MoA) architecture, a novel and parameter-efficient tuning method designed for multi-task learning with LLMs. In this paper, we start by individually training multiple domain-specific LoRA modules on the corresponding supervised corpus data. These LoRA modules can be aligned with the expert design principles observed in Mixture-of-Experts (MoE). Subsequently, we combine the multiple LoRAs using an explicit routing strategy and introduce domain labels to facilitate multi-task learning, which helps prevent interference between tasks and ultimately enhances the performance of each individual task. Furthermore, each LoRA model can be iteratively adapted to a new domain, allowing for quick domain-specific adaptation. Experiments on diverse tasks demonstrate superior and robust performance, which can further promote the wide application of domain-specific LLMs.
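As a rough illustration of the architecture described in the abstract, the sketch below shows one way a frozen linear layer of a base LLM could be wrapped with several domain-specific LoRA experts plus a router that selects one expert per example, optionally supervised by a domain label. This is a minimal sketch under stated assumptions, not the paper's exact implementation: the class names (`LoRAExpert`, `MixtureOfLoRAsLayer`), the mean-pooled routing features, and the hard per-example routing are illustrative choices.

```python
import torch
import torch.nn as nn


class LoRAExpert(nn.Module):
    """Standard LoRA adapter: low-rank update (B @ A) scaled by alpha / r."""

    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, r))         # up-projection, zero-init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_in) -> (batch, seq, d_out)
        return (x @ self.A.T) @ self.B.T * self.scale


class MixtureOfLoRAsLayer(nn.Module):
    """Hypothetical wrapper: a frozen base linear layer plus several
    domain-specific LoRA experts and a simple router over the hidden state."""

    def __init__(self, base_linear: nn.Linear, num_domains: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False  # base LLM weights stay frozen
        self.experts = nn.ModuleList(
            [LoRAExpert(base_linear.in_features, base_linear.out_features, r, alpha)
             for _ in range(num_domains)]
        )
        # Router over mean-pooled inputs; during training its logits can be
        # supervised with the known domain label of each example.
        self.router = nn.Linear(base_linear.in_features, num_domains)

    def forward(self, x: torch.Tensor, domain_id: torch.Tensor | None = None):
        # x: (batch, seq, d_in); domain_id: LongTensor of shape (batch,) or None
        logits = self.router(x.mean(dim=1))        # (batch, num_domains)
        if domain_id is None:
            domain_id = logits.argmax(dim=-1)      # route by prediction at inference
        out = self.base(x)
        for i, expert in enumerate(self.experts):
            mask = (domain_id == i).view(-1, 1, 1).to(x.dtype)
            out = out + mask * expert(x)           # add only the selected LoRA delta
        return out, logits                         # logits can feed a routing loss
```

In a setup like this, the router logits could be trained with a cross-entropy loss against the provided domain labels during multi-task tuning, while at inference the predicted domain decides which LoRA delta is applied; swapping in or retraining a single `LoRAExpert` would then correspond to the quick domain-specific adaptation the abstract mentions.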