Emergent Mind

SPLADE-v3: New baselines for SPLADE

(2403.06789)
Published Mar 11, 2024 in cs.IR and cs.CL

Abstract

A companion to the release of the latest version of the SPLADE library. We describe changes to the training structure and present our latest series of models -- SPLADE-v3. We compare this new version to BM25, SPLADE++, as well as re-rankers, and showcase its effectiveness via a meta-analysis over more than 40 query sets. SPLADE-v3 further pushes the limit of SPLADE models: it is statistically significantly more effective than both BM25 and SPLADE++, while comparing well to cross-encoder re-rankers. Specifically, it gets more than 40 MRR@10 on the MS MARCO dev set, and improves by 2% the out-of-domain results on the BEIR benchmark.

Figure: Meta-analysis comparing SPLADE-v3 and DeBERTaV3's re-ranking effectiveness on the top-50 results.

Overview

  • SPLADE-v3 introduces advancements in information retrieval with sparse representations, showing significant improvements over previous SPLADE models and benchmarks like BM25.

  • Key enhancements include the use of multiple negatives per batch, a novel approach to distillation scores, combining of distillation losses, and strategic fine-tuning for superior performance.

  • The evaluation demonstrates that SPLADE-v3 consistently outperforms BM25, improves over SPLADE++SelfDistil, and performs comparably to cross-encoder re-rankers across a variety of datasets.

  • SPLADE-v3 introduces three variants targeting specific applications, signaling a noteworthy progression in natural language processing for information retrieval tasks.

Enhancements in SPLADE Models: An Examination of SPLADE-v3

Introduction to SPLADE-v3

The technical report introduces SPLADE-v3, an advancement in the SPLADE series of models designed for improved information retrieval. SPLADE (SParse Lexical AnD Expansion) models are distinguished by their ability to handle natural language queries efficiently by representing queries and documents as sparse lexical vectors. This iteration, SPLADE-v3, leverages modifications to the training structure to achieve statistically significant improvements over its predecessors and over benchmark models like BM25, and it performs comparably to cross-encoder re-rankers.
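The sparse representation at the heart of SPLADE can be sketched as follows: each vocabulary term's weight is a log-saturated ReLU activation, pooled over the input tokens. The function name and the plain-list layout below are illustrative choices, not code from the report:

```python
import math

def splade_weights(token_logits, vocab_size):
    """Sketch of a SPLADE-style sparse representation.

    For each vocabulary term j, max-pool log(1 + ReLU(logit)) over the
    input tokens. token_logits is a list (one entry per input token) of
    lists (one logit per vocabulary term), standing in for MLM-head output.
    """
    weights = []
    for j in range(vocab_size):
        w = max(math.log(1.0 + max(tl[j], 0.0)) for tl in token_logits)
        weights.append(w)
    return weights

# Two input tokens, a toy vocabulary of size 2: most weights stay small
# or zero, which is what makes the representation sparse and indexable.
vec = splade_weights([[2.0, -1.0], [0.0, 3.0]], 2)
```

The log saturation dampens very large activations, and the ReLU zeroes out negative ones, so most vocabulary dimensions end up exactly zero and the vector can be stored in an inverted index.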

Key Innovations in Model Training

Multiple Negatives per Batch

Incorporating guidance from the Tevatron framework, SPLADE-v3 is trained with an increased number of hard negatives per batch. This strategy improves results, particularly in in-domain settings, although it contributes little to out-of-domain generalization.
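Training with multiple hard negatives is typically implemented as a softmax cross-entropy over one positive and several negative passages per query. The sketch below uses raw floats in place of SPLADE dot-product scores; the function name and values are illustrative, not the report's code:

```python
import math

def contrastive_loss(pos_score, neg_scores):
    """Softmax cross-entropy over one positive and several hard negatives.

    pos_score: relevance score of the positive passage.
    neg_scores: scores of the hard-negative passages in the same batch slot.
    Equivalent to -log softmax(pos) over [pos] + negatives.
    """
    logits = [pos_score] + list(neg_scores)
    m = max(logits)  # subtract the max for numerical stability
    denom = sum(math.exp(s - m) for s in logits)
    return -(pos_score - m - math.log(denom))

# Adding harder negatives (scores close to the positive) raises the loss,
# giving the model a stronger training signal per batch.
loss_few = contrastive_loss(5.0, [1.0])
loss_many = contrastive_loss(5.0, [1.0, 4.5, 4.8])
```

The design intuition: each extra hard negative adds a competing term to the softmax denominator, so the model is pushed to rank the positive above all of them simultaneously.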

Distillation Score Enhancement

A notable change is the use of an ensemble of cross-encoder re-rankers to generate distillation scores. This diverges from the traditional use of a single teacher model: the ensemble's scores, combined after affine transformations, yield a more effective student model.

Combining Distillation Losses

The report merges the two distillation losses most commonly used in information retrieval: KL-Div and MarginMSE. This hybrid approach is motivated by empirical findings that the two losses emphasize recall and precision, respectively, and it improves SPLADE-v3's performance indicators.
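The two losses can be sketched in a few lines. The weighting scheme below (a single `lam` coefficient) is a hypothetical simplification; the report's exact mixing may differ:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def kl_div(teacher_scores, student_scores):
    """KL divergence between teacher and student score distributions
    over one query's candidate passages."""
    p = softmax(teacher_scores)
    q = softmax(student_scores)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def margin_mse(teacher_scores, student_scores):
    """MSE between teacher and student margins; by convention the first
    score is the positive passage, the rest are negatives."""
    t_pos, s_pos = teacher_scores[0], student_scores[0]
    diffs = [((t_pos - t) - (s_pos - s)) ** 2
             for t, s in zip(teacher_scores[1:], student_scores[1:])]
    return sum(diffs) / len(diffs)

def combined_loss(teacher, student, lam=0.5):
    """Hypothetical linear mix of the two distillation losses."""
    return lam * kl_div(teacher, student) + (1 - lam) * margin_mse(teacher, student)
```

KL-Div matches the full score distribution (helping recall further down the ranking), while MarginMSE matches the positive-vs-negative margins (sharpening precision at the top), which is why combining them is attractive.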

Fine-Tuning Details

An observable gain in effectiveness was realized by initializing SPLADE-v3's training from the SPLADE++SelfDistil checkpoint rather than from a more basic checkpoint. This suggests that a form of curriculum learning may be at work, although further exploration is needed to fully understand the underlying mechanism.

Performance Evaluation

The evaluation of SPLADE-v3 involved a comprehensive meta-analysis encompassing over 40 query sets across various datasets, using metrics like MRR@10 and nDCG@10. The findings indicate:

  • Consistent outperformance of BM25, with substantial gains on most of the 44 query sets.
  • Improved effectiveness over SPLADE++SelfDistil across numerous datasets, with only minor exceptions.
  • Performance comparable to cross-encoder re-rankers, with SPLADE-v3 matching or exceeding them on several datasets.

Variants of SPLADE-v3

The report introduces three additional variants of SPLADE-v3, each tailored for specific applications:

  • SPLADE-v3-DistilBERT: Offers a reduced inference footprint by building upon DistilBERT.
  • SPLADE-v3-Lexical: Removes query expansion, favoring efficiency at the cost of reduced effectiveness in out-of-domain settings.
  • SPLADE-v3-Doc: Starts training from a CoCondenser checkpoint and simplifies query processing, striking a balance between efficiency and effectiveness.
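The effect of dropping query expansion, as in SPLADE-v3-Lexical, can be sketched as filtering the learned sparse query vector down to tokens that literally appear in the query. The names and weights below are hypothetical:

```python
def lexical_query_vector(query_tokens, learned_weights):
    """Keep only weights for tokens actually present in the query,
    discarding expansion terms the model would otherwise add.

    query_tokens: the tokenized query.
    learned_weights: dict token -> weight, as a full SPLADE-style model
    might produce (including expansion terms).
    """
    present = set(query_tokens)
    return {t: w for t, w in learned_weights.items() if t in present}

# "pet" is an expansion term not in the query, so it is dropped; the
# shorter posting-list lookups make retrieval cheaper.
vec = lexical_query_vector(["cat", "food"],
                           {"cat": 1.2, "food": 0.8, "pet": 0.5})
```

This trades effectiveness (especially out of domain, where expansion terms help bridge vocabulary gaps) for faster query processing, matching the trade-off described above.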

Conclusion and Forward Look

SPLADE-v3 and its variants represent a significant step forward in the SPLADE research direction. The model's enhanced effectiveness, combined with its compelling comparison to other state-of-the-art approaches, underscores the potential of SPLADE models in tackling complex information retrieval tasks. As SPLADE-v3 sets new benchmarks, it invites further exploration into optimizing model training approaches and expanding the application horizons for SPLADE models in the field of natural language processing and beyond.
