LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation (2309.15363v2)
Abstract: Multimedia recommendation aims to predict users' future behaviors based on observed behaviors and item content information. However, the noise inherent in observed behaviors easily leads to suboptimal recommendation performance. Recently, the diffusion model's ability to generate information from noise presents a promising solution to this issue, prompting us to explore its application in multimedia recommendation. Nonetheless, several challenges must be addressed: 1) the diffusion model requires simplification to meet the efficiency requirements of real-time recommender systems, and 2) the generated behaviors must align with user preferences. To address these challenges, we propose a Light Diffusion model for Multimedia Recommendation (LD4MRec). LD4MRec substantially reduces computational complexity by employing a forward-free inference strategy, which directly predicts future behaviors from observed noisy behaviors. Meanwhile, to ensure alignment between generated behaviors and user preferences, we propose a novel Conditional neural Network (C-Net). C-Net achieves guided generation by leveraging two key signals, collaborative signals and personalized modality preference signals, thereby improving the semantic consistency between generated behaviors and user preferences. Experiments conducted on three real-world datasets demonstrate the effectiveness of LD4MRec.
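To make the two ideas named in the abstract concrete, below is a minimal, illustrative sketch (not the authors' code) of (1) forward-free inference, i.e. treating the observed interaction vector as an already-noisy sample and denoising it in a single pass rather than running a forward noising chain at serving time, and (2) a C-Net-style conditional denoiser guided by a collaborative embedding and a personalized modality-preference embedding. All layer sizes, names, and the fusion scheme are assumptions for illustration only.

```python
import torch
import torch.nn as nn


class ConditionalDenoiser(nn.Module):
    """Predicts a clean interaction vector from a noisy one, conditioned on
    a collaborative signal and a personalized modality-preference signal.
    (Hypothetical sketch; dimensions and fusion are assumptions.)"""

    def __init__(self, num_items: int, cond_dim: int, hidden_dim: int = 256):
        super().__init__()
        # Project the two guidance signals into a single conditioning vector.
        self.cond_proj = nn.Linear(2 * cond_dim, hidden_dim)
        # Denoising MLP over the (noisy) interaction vector plus the condition.
        self.net = nn.Sequential(
            nn.Linear(num_items + hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, num_items),
        )

    def forward(self, noisy_interactions, collab_emb, modality_pref_emb):
        cond = torch.tanh(
            self.cond_proj(torch.cat([collab_emb, modality_pref_emb], dim=-1))
        )
        return self.net(torch.cat([noisy_interactions, cond], dim=-1))


@torch.no_grad()
def recommend(model, observed_interactions, collab_emb, modality_pref_emb, top_k=10):
    """Forward-free inference: feed the observed behavior vector straight to the
    denoiser once, instead of noising it and iteratively reversing the chain."""
    scores = model(observed_interactions, collab_emb, modality_pref_emb)
    # Mask out items the user has already interacted with.
    scores = scores.masked_fill(observed_interactions > 0, float("-inf"))
    return scores.topk(top_k, dim=-1).indices


if __name__ == "__main__":
    num_items, cond_dim, batch = 1000, 64, 4
    model = ConditionalDenoiser(num_items, cond_dim)
    observed = torch.bernoulli(torch.full((batch, num_items), 0.01))  # toy implicit feedback
    collab = torch.randn(batch, cond_dim)
    modality_pref = torch.randn(batch, cond_dim)
    print(recommend(model, observed, collab, modality_pref).shape)  # torch.Size([4, 10])
```

In this reading, the efficiency gain comes from replacing the multi-step reverse diffusion at inference with a single conditional denoising pass; the actual network architecture and conditioning mechanism used by LD4MRec may differ.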