
LLMs for User Interest Exploration in Large-scale Recommendation Systems (2405.16363v2)

Published 25 May 2024 in cs.IR and cs.AI

Abstract: Traditional recommendation systems are subject to a strong feedback loop by learning from and reinforcing past user-item interactions, which in turn limits the discovery of novel user interests. To address this, we introduce a hybrid hierarchical framework combining LLMs and classic recommendation models for user interest exploration. The framework controls the interfacing between the LLMs and the classic recommendation models through "interest clusters", the granularity of which can be explicitly determined by algorithm designers. It recommends the next novel interests by first representing "interest clusters" using language, and employs a fine-tuned LLM to generate novel interest descriptions that are strictly within these predefined clusters. At the low level, it grounds these generated interests to an item-level policy by restricting classic recommendation models, in this case a transformer-based sequence recommender to return items that fall within the novel clusters generated at the high level. We showcase the efficacy of this approach on an industrial-scale commercial platform serving billions of users. Live experiments show a significant increase in both exploration of novel interests and overall user enjoyment of the platform.


Summary

  • The paper presents a hybrid hierarchical framework that leverages LLMs to predict novel interest clusters and expand user discovery, overcoming typical feedback loops.
  • It deploys a two-level recommendation system where high-level LLM policies generate interest clusters while low-level models ground recommendations to specific items.
  • Live experiments on a commercial platform demonstrate significant improvements in user exploration, content engagement, and overall user growth.

Summary of "LLMs for User Interest Exploration in Large-scale Recommendation Systems"

The paper presents an innovative framework for enhancing user interest exploration within large-scale recommendation systems by merging LLMs with established recommendation algorithms. This hybrid hierarchical framework seeks to circumvent the typical feedback loops found in recommendation systems, which prioritize previous user behavior over potentially novel content, thereby limiting the breadth of user discovery. Through the effective integration of LLMs, the framework generates novel interest clusters and confines classic recommendation models to recommend items within these clusters.

Hybrid Hierarchical Framework

High-Level Language Policy

The high-level language policy uses an LLM to reason over a user's previous interactions, represented as interest clusters described in natural language. Given the descriptions of recently consumed clusters, the LLM predicts new interests the user may want to explore. The cluster granularity, set by algorithm designers, controls the interface between the LLM and the downstream recommender, and generation is constrained so that predictions remain within the predefined interest clusters.

Figure 1: LLM-powered hybrid hierarchical planning diagram for user interest exploration.
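
A minimal sketch of this high-level policy, assuming a toy cluster vocabulary and an abstract LLM call; `build_prompt` and `snap_to_cluster` are illustrative helpers, not the paper's actual implementation:

```python
# Sketch of the high-level language policy (names and prompt wording are
# illustrative assumptions, not the paper's actual API or prompt).
from typing import List

PREDEFINED_CLUSTERS = ["retro gaming", "home baking", "urban gardening"]  # toy set

def build_prompt(user_cluster_history: List[str], k: int = 2) -> str:
    """Describe the user's K recently consumed interest clusters and ask
    the LLM for a novel interest to explore next."""
    history = "; ".join(user_cluster_history[-k:])
    return (
        f"A user recently watched videos about: {history}. "
        "Suggest one new interest this user may want to explore next."
    )

def snap_to_cluster(generated: str, clusters: List[str]) -> str:
    """Ground free-form LLM output to the closest predefined cluster.
    Here: naive token overlap; the paper instead constrains generation
    so outputs already land inside the cluster vocabulary."""
    def overlap(c: str) -> int:
        return len(set(c.split()) & set(generated.lower().split()))
    return max(clusters, key=overlap)
```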

Practical deployment relies on pre-computed lookup operations: the clustering and offline inference strategy makes it feasible to handle vast item corpora efficiently. Concretely, the effective recommendation space is reduced to pairs of interest clusters, LLM inference is run offline over these pairs, and online serving reduces to a lightweight table lookup.
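
A sketch of this offline-inference, online-lookup pattern; `llm_predict_novel_cluster` is a stand-in for batch LLM inference and is not part of any real API:

```python
# Offline: enumerate cluster pairs once and cache the LLM's prediction for
# each. Online: no LLM call, just a dictionary lookup on the user's recent
# cluster pair.
from itertools import combinations

def build_lookup_table(clusters, llm_predict_novel_cluster):
    """Offline batch job: one LLM prediction per (cluster, cluster) pair."""
    table = {}
    for pair in combinations(sorted(clusters), 2):
        table[pair] = llm_predict_novel_cluster(pair)
    return table

def serve_novel_cluster(table, recent_clusters):
    """Online serving: reduce the request to a sorted cluster pair and look it up."""
    pair = tuple(sorted(recent_clusters[-2:]))
    return table.get(pair)
```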

Low-Level Item Policy

This component grounds language-generated interests to individual items by reusing existing recommendation models, in this case a transformer-based sequence recommender combined with a restriction policy. The recommender is constrained to return only items that fall within the generated interest clusters, keeping recommendations relevant and personalized while still steering users toward novel content.
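
A minimal sketch of such a restriction policy; `score_items` and `item_to_cluster` are assumed interfaces standing in for the sequence model and the item-to-cluster mapping:

```python
# Ground a novel cluster to items: filter candidates to the target cluster,
# then rank the survivors with an existing sequence recommender.
def recommend_within_cluster(user_history, target_cluster, candidates,
                             score_items, item_to_cluster, top_k=10):
    # Keep only items belonging to the LLM-generated novel cluster.
    allowed = [i for i in candidates if item_to_cluster[i] == target_cluster]
    # Score the restricted candidate set, e.g. with a transformer sequence model.
    scores = score_items(user_history, allowed)
    ranked = sorted(zip(allowed, scores), key=lambda p: p[1], reverse=True)
    return [item for item, _ in ranked[:top_k]]
```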

Controlled generation is essential: the LLM must produce interest descriptions that map exactly onto the predefined clusters rather than free-form text. Fine-tuning also balances the model's global world knowledge against platform-specific interaction data, which is crucial for aligning LLM outputs with actual user behaviors.

Fine-Tuning for User Behavior Alignment

Fine-tuning the LLM on real user interaction data serves both goals: it enforces controlled generation, so outputs land in valid interest clusters, and it aligns predictions with observed user behavior. Careful data curation is used to balance the label clusters and mitigate skewed generation frequencies.

Figure 2: Prompt for Novel Interest Prediction when K=2.
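
A sketch of how such fine-tuning examples might be curated, following the K=2 prompt format above; the field names, prompt wording, and per-label cap are assumptions for illustration:

```python
# Build (prompt, label) pairs from observed transitions where a user consumed
# clusters (a, b) and then a novel cluster, down-sampling frequent labels to
# mitigate biased generation frequencies.
import random
from collections import defaultdict

def curate_examples(transitions, max_per_label=1000, seed=0):
    """transitions: list of ((cluster_a, cluster_b), novel_cluster) tuples
    extracted from real user journeys."""
    by_label = defaultdict(list)
    for (a, b), novel in transitions:
        prompt = (f"A user recently watched videos about: {a}; {b}. "
                  "Suggest one new interest this user may want to explore next.")
        by_label[novel].append({"prompt": prompt, "label": novel})
    rng = random.Random(seed)
    examples = []
    for label, items in by_label.items():
        rng.shuffle(items)
        examples.extend(items[:max_per_label])  # cap to balance label clusters
    return examples
```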

Figure 3: Distribution of labels generated by the fine-tuned LLM. The X-axis represents label frequency; the Y-axis represents the percentage of labels within each frequency range.

Controlled generation and behavioral alignment are assessed with metrics such as match rate and recall, verifying that the fine-tuned model produces outputs that fall within valid clusters and are consistent with the interests users actually consume next.
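
A sketch of these two offline checks under one plausible reading: match rate as the share of generations that land in a valid predefined cluster, and recall as the share of cases where the user's actual next cluster appears among the generated ones. The paper's exact definitions may differ:

```python
# Offline evaluation sketch for controlled generation and behavioral alignment.
def match_rate(generations, valid_clusters):
    """Fraction of generated descriptions that map to a valid predefined cluster."""
    return sum(g in valid_clusters for g in generations) / len(generations)

def recall(generated_lists, next_clusters):
    """Fraction of users whose actually-consumed next cluster appears among
    the clusters generated for them."""
    hits = sum(nxt in gen for gen, nxt in zip(generated_lists, next_clusters))
    return hits / len(next_clusters)
```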

Live Experiments

Experiments conducted on a commercial recommendation platform demonstrate significant gains in user exploration and content engagement metrics. The framework effectively expands user interests, facilitating longer platform engagement and increased user growth.

Figure 4: (a) Model Finetuning Process. (b) and (c) Comparison between different recommenders in live experiments.

Figure 5: The proposed method drives user growth.

Novelty and Quality

The proposed method excels in introducing novel content while maintaining recommendation quality, outperforming existing exploration-oriented models. Novelty and user engagement metrics confirm its success in diversifying user consumption patterns.

User Interest Exploration

The framework significantly increases the diversity of user interest consumption, as measured by the UCI metric, encouraging exploration across interest levels and demonstrating its efficacy in broadening user discovery.
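
As an illustration, assuming UCI counts the unique interest clusters a user consumes in a given window (the paper's exact definition may differ), a population-level average could be computed as follows:

```python
# Toy UCI-style diversity measure: average number of unique interest clusters
# consumed per user over a time window (assumed definition, for illustration).
from statistics import mean

def mean_uci(user_item_logs, item_to_cluster):
    """user_item_logs: {user_id: [item_id, ...]} consumed in the window."""
    return mean(len({item_to_cluster[i] for i in items})
                for items in user_item_logs.values())
```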

User Growth

Real-world user engagement metrics reveal enhanced platform usage, attributed to the novel, high-quality recommendations provided by the proposed framework. Significant increases in active users further affirm its potential in driving growth.

Conclusion

This hybrid hierarchical framework offers a methodical approach to integrate LLMs within large-scale recommendation systems, overcoming traditional limitations by facilitating novel interest exploration. The methodology provides a seamless user experience with implications for broadening content engagement and enhancing user satisfaction. Future improvements will explore long-term planning considerations to further refine hierarchical recommendation processes.
