
CoST: Contrastive Quantization based Semantic Tokenization for Generative Recommendation (2404.14774v2)

Published 23 Apr 2024 in cs.IR

Abstract: Embedding-based retrieval serves as a dominant approach to candidate item matching for industrial recommender systems. With the success of generative AI, generative retrieval has recently emerged as a new retrieval paradigm for recommendation, which casts item retrieval as a generation problem. Its model consists of two stages: semantic tokenization and autoregressive generation. The first stage involves item tokenization that constructs discrete semantic tokens to index items, while the second stage autoregressively generates semantic tokens of candidate items. Therefore, semantic tokenization serves as a crucial preliminary step for training generative recommendation models. Existing research usually employs a vector quantizer with reconstruction loss (e.g., RQ-VAE) to obtain semantic tokens of items, but this method fails to capture the essential neighborhood relationships that are vital for effective item modeling in recommender systems. In this paper, we propose a contrastive quantization-based semantic tokenization approach, named CoST, which harnesses both item relationships and semantic information to learn semantic tokens. Our experimental results highlight the significant impact of semantic tokenization on generative recommendation performance, with CoST achieving up to a 43% improvement in Recall@5 and 44% improvement in NDCG@5 on the MIND dataset over previous baselines.


Summary

  • The paper introduces a novel generative recommendation framework using contrastive quantization to effectively leverage semantic and relational item data.
  • It employs an enhanced RQ-VAE model with multi-stage quantization to convert pre-trained text embeddings into discrete semantic codes.
  • Experimental results on MIND and Amazon datasets show improvements up to 80.95% in key metrics, validating its practical impact on recommendation systems.

CoST: Contrastive Quantization based Semantic Tokenization for Generative Recommendation

Introduction

This paper introduces CoST (Contrastive Quantization based Semantic Tokenization for Generative Recommendation), a novel approach to generative recommendation. Its primary thrust is to refine the construction of item semantic codes by leveraging both item relationships and semantic information, which is achieved by integrating contrastive learning within a generative retrieval framework. Traditional methods have focused primarily on the semantic content of item descriptions without adequately considering inter-item relationships, which are crucial for recommendation modeling. Using a pre-trained text encoder, the approach converts item textual descriptions into embeddings, which are then processed through an enhanced RQ-VAE model trained with contrastive objectives (Figure 1).

Figure 1: An Overview of Contrastive Quantization based Semantic Code Framework for Generative Recommendation.
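As a concrete illustration of the first step in this pipeline, the following is a minimal sketch of encoding item descriptions into dense embeddings with a pre-trained text encoder such as Sentence-T5 (the paper mentions Sentence-T5 and BERT). The use of the sentence-transformers library, the specific checkpoint name, and the example descriptions are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' code): encode item descriptions into dense
# embeddings with a pre-trained text encoder, here Sentence-T5 via the
# sentence-transformers library. Checkpoint name and batch size are illustrative.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("sentence-transformers/sentence-t5-base")

item_descriptions = [
    "Noise-cancelling over-ear headphones with 30-hour battery life",
    "Ergonomic office chair with adjustable lumbar support",
]

# Shape: (num_items, embedding_dim). These embeddings are the input to the
# semantic tokenization (quantization) stage described in the Methodology.
item_embeddings = encoder.encode(item_descriptions, batch_size=32)
```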

Methodology

Semantic Code Construction

Semantic codes are constructed with an RQ-VAE model, which sequentially quantizes item embeddings obtained from pre-trained text encoders such as Sentence-T5 and BERT. In this multi-stage process, each quantization level uses its own codebook to encode the residual left by the previous level, yielding a tuple of codes per item that captures information at multiple granularities while mitigating code conflicts, as sketched below (Figure 2).

Figure 2: The RQ-VAE model, the semantic reconstruction, and our contrastive quantization.
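The residual quantization step can be sketched as follows. This is a simplified illustration assuming the standard RQ-VAE structure (one codebook per level, each level quantizing the residual of the previous one); the encoder/decoder networks, the commitment and reconstruction losses, and the exact hyperparameters used in CoST are omitted, and the codebook sizes and dimensions shown are placeholders.

```python
# Illustrative sketch of multi-stage residual quantization (not the authors' code).
import torch
import torch.nn as nn


class ResidualQuantizer(nn.Module):
    def __init__(self, num_levels: int = 3, codebook_size: int = 256, dim: int = 32):
        super().__init__()
        # One codebook per quantization level.
        self.codebooks = nn.ModuleList(
            [nn.Embedding(codebook_size, dim) for _ in range(num_levels)]
        )

    def forward(self, z: torch.Tensor):
        """z: (batch, dim) latent item embeddings -> (codes, quantized)."""
        residual = z
        codes, quantized = [], torch.zeros_like(z)
        for codebook in self.codebooks:
            # Pick the nearest codebook entry for the current residual.
            distances = torch.cdist(residual, codebook.weight)  # (batch, codebook_size)
            idx = distances.argmin(dim=-1)                       # (batch,)
            selected = codebook(idx)                             # (batch, dim)
            codes.append(idx)
            quantized = quantized + selected
            residual = residual - selected                       # pass residual to next level
        # codes: one index per level, i.e. the item's tuple of semantic codes.
        return torch.stack(codes, dim=-1), quantized
```

Each item is then indexed by its per-level code tuple, which serves as the semantic token sequence to be generated autoregressively in the second stage.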

Contrastive Learning

Contrastive learning is applied to sharpen item discriminability: the embedding reconstructed by the decoder for an item is treated as that item's positive instance, while other generated samples serve as negatives. This objective maximizes the mutual information between similar items and separates unrelated items more effectively.
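An InfoNCE-style loss consistent with this description could look like the following sketch, where an item's reconstructed embedding is its positive and the reconstructions of the other items in the batch act as negatives. The exact positive/negative construction, similarity function, and temperature used in CoST may differ.

```python
# Hedged sketch of an InfoNCE-style contrastive quantization loss (illustrative only).
import torch
import torch.nn.functional as F


def contrastive_quantization_loss(original: torch.Tensor,
                                  reconstructed: torch.Tensor,
                                  tau: float = 0.1) -> torch.Tensor:
    """original, reconstructed: (batch, dim) item embeddings before/after quantization."""
    original = F.normalize(original, dim=-1)
    reconstructed = F.normalize(reconstructed, dim=-1)
    # Similarity of every original embedding to every reconstructed embedding.
    logits = original @ reconstructed.t() / tau            # (batch, batch)
    # The matching reconstruction (diagonal entry) is each item's positive.
    targets = torch.arange(original.size(0), device=original.device)
    return F.cross_entropy(logits, targets)
```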

Experiments and Results

The experimental setup leveraged two large-scale datasets: MIND and the Amazon Office Products domain. The experiments showed significant improvements over previous baselines, with NDCG@5 up by 43.76% on the MIND dataset and Recall@10 up by 80.95% on the Office dataset. These gains substantiate the effectiveness of incorporating item relationships into the code construction process, which traditional methods often overlook (Figure 3).

Figure 3: Analysis of the temperature τ and the number of training epochs.

Sensitivity analyses on parameters such as the temperature τ and the number of training epochs corroborate the robustness of the RQ-VAE model. The findings indicate that as the number of training epochs increases and τ is tuned appropriately, NDCG and Recall improve steadily (Figure 4).

Figure 4: Analysis of codebook size, number of codebooks, and embedding dimension.

Further analysis of the codebook parameters reveals that increasing the codebook size and embedding dimension enhances representational capacity, leading to consistent performance gains in downstream tasks.

Conclusion

This work presents a significant step forward for generative recommendation systems by introducing a contrastive quantization method that integrates semantic and relational information. The substantial improvements demonstrated across datasets point to further opportunities in semantic code construction, such as incorporating richer contextual data and user behavior signals, which future research may explore to push the boundaries of generative retrieval systems.

Overall, this approach is a meaningful extension of existing generative retrieval paradigms, with promising implications for practical deployment in recommender systems.