
CoST: Contrastive Quantization based Semantic Tokenization for Generative Recommendation (2404.14774v2)

Published 23 Apr 2024 in cs.IR

Abstract: Embedding-based retrieval serves as a dominant approach to candidate item matching for industrial recommender systems. With the success of generative AI, generative retrieval has recently emerged as a new retrieval paradigm for recommendation, which casts item retrieval as a generation problem. Its model consists of two stages: semantic tokenization and autoregressive generation. The first stage involves item tokenization that constructs discrete semantic tokens to index items, while the second stage autoregressively generates semantic tokens of candidate items. Therefore, semantic tokenization serves as a crucial preliminary step for training generative recommendation models. Existing research usually employs a vector quantizer with reconstruction loss (e.g., RQ-VAE) to obtain semantic tokens of items, but this method fails to capture the essential neighborhood relationships that are vital for effective item modeling in recommender systems. In this paper, we propose a contrastive quantization-based semantic tokenization approach, named CoST, which harnesses both item relationships and semantic information to learn semantic tokens. Our experimental results highlight the significant impact of semantic tokenization on generative recommendation performance, with CoST achieving up to a 43% improvement in Recall@5 and 44% improvement in NDCG@5 on the MIND dataset over previous baselines.


Summary

  • The paper introduces a novel generative recommendation framework using contrastive quantization to effectively leverage semantic and relational item data.
  • It employs an enhanced RQ-VAE model with multi-stage quantization to convert pre-trained text embeddings into discrete semantic codes.
  • Experimental results on MIND and Amazon datasets show improvements up to 80.95% in key metrics, validating its practical impact on recommendation systems.

CoST: Contrastive Quantization based Semantic Tokenization for Generative Recommendation

Introduction

This paper introduces CoST (Contrastive Quantization based Semantic Tokenization for Generative Recommendation), a novel approach to generative recommendation. Its primary thrust is to refine the construction of item semantic codes by leveraging both item relationships and semantic information, which is achieved by integrating contrastive learning within a generative retrieval framework. Traditional methods have focused primarily on the semantic content of item descriptions without adequately considering inter-item relationships, which are crucial for recommendation modeling. Using a pre-trained text encoder, the approach converts item textual descriptions into embeddings, which are then processed through an enhanced RQ-VAE model trained with contrastive objectives (Figure 1).

Figure 1: An Overview of Contrastive Quantization based Semantic Code Framework for Generative Recommendation.
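As a concrete illustration of the first step in this pipeline, the following is a minimal sketch of encoding item descriptions into dense embeddings with a pre-trained text encoder such as Sentence-T5 (the paper mentions Sentence-T5 and BERT). The use of the sentence-transformers library, the specific checkpoint name, and the example descriptions are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' code): encode item descriptions into dense
# embeddings with a pre-trained text encoder, here Sentence-T5 via the
# sentence-transformers library. Checkpoint name and batch size are illustrative.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("sentence-transformers/sentence-t5-base")

item_descriptions = [
    "Noise-cancelling over-ear headphones with 30-hour battery life",
    "Ergonomic office chair with adjustable lumbar support",
]

# Shape: (num_items, embedding_dim). These embeddings are the input to the
# semantic tokenization (quantization) stage described in the Methodology.
item_embeddings = encoder.encode(item_descriptions, batch_size=32)
```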

Methodology

Semantic Code Construction

Semantic codes are constructed with an RQ-VAE model, which sequentially quantizes item embeddings obtained from pre-trained text encoders such as Sentence-T5 and BERT. In this multi-stage process, each quantization level uses its own codebook to encode the residual left by the previous level, yielding a tuple of codes per item that captures information at multiple granularities while mitigating code conflicts, as sketched below (Figure 2).

Figure 2: The RQ-VAE model, the semantic reconstruction, and our contrastive quantization.
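The residual quantization step can be sketched as follows. This is a simplified illustration assuming the standard RQ-VAE structure (one codebook per level, each level quantizing the residual of the previous one); the encoder/decoder networks, the commitment and reconstruction losses, and the exact hyperparameters used in CoST are omitted, and the codebook sizes and dimensions shown are placeholders.

```python
# Illustrative sketch of multi-stage residual quantization (not the authors' code).
import torch
import torch.nn as nn


class ResidualQuantizer(nn.Module):
    def __init__(self, num_levels: int = 3, codebook_size: int = 256, dim: int = 32):
        super().__init__()
        # One codebook per quantization level.
        self.codebooks = nn.ModuleList(
            [nn.Embedding(codebook_size, dim) for _ in range(num_levels)]
        )

    def forward(self, z: torch.Tensor):
        """z: (batch, dim) latent item embeddings -> (codes, quantized)."""
        residual = z
        codes, quantized = [], torch.zeros_like(z)
        for codebook in self.codebooks:
            # Pick the nearest codebook entry for the current residual.
            distances = torch.cdist(residual, codebook.weight)  # (batch, codebook_size)
            idx = distances.argmin(dim=-1)                       # (batch,)
            selected = codebook(idx)                             # (batch, dim)
            codes.append(idx)
            quantized = quantized + selected
            residual = residual - selected                       # pass residual to next level
        # codes: one index per level, i.e. the item's tuple of semantic codes.
        return torch.stack(codes, dim=-1), quantized
```

Each item is then indexed by its per-level code tuple, which serves as the semantic token sequence to be generated autoregressively in the second stage.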

Contrastive Learning

Contrastive learning is applied to sharpen item discriminability: the embedding reconstructed by the decoder for an item is treated as that item's positive instance, while other generated samples serve as negatives. This objective maximizes the mutual information between similar items and separates unrelated items more effectively.
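An InfoNCE-style loss consistent with this description could look like the following sketch, where an item's reconstructed embedding is its positive and the reconstructions of the other items in the batch act as negatives. The exact positive/negative construction, similarity function, and temperature used in CoST may differ.

```python
# Hedged sketch of an InfoNCE-style contrastive quantization loss (illustrative only).
import torch
import torch.nn.functional as F


def contrastive_quantization_loss(original: torch.Tensor,
                                  reconstructed: torch.Tensor,
                                  tau: float = 0.1) -> torch.Tensor:
    """original, reconstructed: (batch, dim) item embeddings before/after quantization."""
    original = F.normalize(original, dim=-1)
    reconstructed = F.normalize(reconstructed, dim=-1)
    # Similarity of every original embedding to every reconstructed embedding.
    logits = original @ reconstructed.t() / tau            # (batch, batch)
    # The matching reconstruction (diagonal entry) is each item's positive.
    targets = torch.arange(original.size(0), device=original.device)
    return F.cross_entropy(logits, targets)
```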

Experiments and Results

The experimental setup leveraged two large-scale datasets: MIND and the Amazon Office Products domain. The experiments showed significant improvements over previous baselines, with NDCG@5 up by 43.76% on the MIND dataset and Recall@10 up by 80.95% on the Office dataset. These gains substantiate the effectiveness of incorporating item relationships into the code construction process, which traditional methods often overlook (Figure 3).

Figure 3: Analysis of the temperature τ and the number of training epochs.

Sensitivity analyses on parameters such as the temperature τ and the number of training epochs corroborate the robustness of the RQ-VAE model. The findings indicate that as the number of training epochs increases and τ is tuned appropriately, NDCG and Recall improve steadily (Figure 4).

Figure 4: Analysis of codebook size, number of codebooks, and embedding dimension.

Further analysis of the codebook parameters reveals that increasing the codebook size and embedding dimension enhances representational capacity, leading to consistent performance gains in downstream tasks.

Conclusion

This work presents a significant step forward for generative recommendation systems by introducing a contrastive quantization method that integrates semantic and relational information. The substantial improvements demonstrated across datasets point to further opportunities in semantic code construction, such as incorporating richer contextual data and user behavior signals, which future research may explore to push the boundaries of generative retrieval systems.

Overall, this approach is a meaningful extension of existing generative retrieval paradigms, with promising implications for practical deployment in recommender systems.