Delayed Feedback Modeling for the Entire Space Conversion Rate Prediction (2011.11826v1)

Published 24 Nov 2020 in cs.LG

Abstract: Estimating post-click conversion rate (CVR) accurately is crucial in E-commerce. However, CVR prediction usually suffers from three major challenges in practice: i) data sparsity: compared with impressions, conversion samples are often extremely scarce; ii) sample selection bias: conventional CVR models are trained with clicked impressions while making inference on the entire space of all impressions; iii) delayed feedback: many conversions can only be observed after a relatively long and random delay since clicks happened, resulting in many false negative labels during training. Previous studies mainly focus on one or two issues while ignoring the others. In this paper, we propose a novel neural network framework ESDF to tackle the above three challenges simultaneously. Unlike existing methods, ESDF models the CVR prediction from a perspective of entire space, and combines the advantage of user sequential behavior pattern and the time delay factor. Specifically, ESDF utilizes sequential behavior of user actions on the entire space with all impressions to alleviate the sample selection bias problem. By sharing the embedding parameters between CTR and CVR networks, data sparsity problem is greatly relieved. Different from conventional delayed feedback methods, ESDF does not make any special assumption about the delay distribution. We discretize the delay time by day slot and model the probability based on survival analysis with deep neural network, which is more practical and suitable for industrial situations. Extensive experiments are conducted to evaluate the effectiveness of our method. To the best of our knowledge, ESDF is the first attempt to unitedly solve the above three challenges in CVR prediction area.

Citations (18)

Summary

  • The paper introduces a novel ESDF framework that jointly addresses data sparsity, sample selection bias, and delayed feedback in CVR prediction.
  • It leverages user sequential behavior and shared embeddings to couple CTR, CVR, and CTCVR tasks using multi-task learning.
  • Experimental results show significant improvements (GAUC +0.08, RelaImpr +6.68%) over benchmarks, validating its practical utility in E-commerce.

Delayed Feedback Modeling for the Entire Space Conversion Rate Prediction

The paper "Delayed Feedback Modeling for the Entire Space Conversion Rate Prediction" addresses three challenges in conversion rate (CVR) prediction on large-scale E-commerce platforms: data sparsity, sample selection bias, and delayed feedback. It introduces a neural network framework named ESDF that tackles all three jointly, combining user sequential behavior with time delay modeling that does not assume a specific delay distribution.

Introduction to CVR Challenges

CVR prediction is pivotal for optimizing E-commerce search and recommendation systems. However, it faces three primary challenges:

  1. Data Sparsity: CVR models are trained on clicked impressions, which are far fewer than total impressions, and conversions among those clicks are scarcer still.
  2. Sample Selection Bias: Training data comes only from clicked impressions, while at inference time predictions must be made over the entire space of impressions.
  3. Delayed Feedback: Many conversions are observed only after a long and random delay following the click, so samples that will eventually convert appear as false negatives during training.
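The delayed feedback problem can be made concrete with a small sketch. The `Click` record, the day-based timestamps, and the toy data below are hypothetical and not taken from the paper; the point is only that the label visible at training time depends on when the data snapshot is taken.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Click:
    click_day: int
    conversion_day: Optional[int]  # None means the click never converts


def observed_label(click: Click, snapshot_day: int) -> int:
    """Label visible when training data is collected on snapshot_day."""
    converted = click.conversion_day is not None
    return int(converted and click.conversion_day <= snapshot_day)


def count_false_negatives(clicks: List[Click], snapshot_day: int) -> int:
    """Clicks that will convert but still look negative at snapshot time."""
    return sum(
        1
        for c in clicks
        if c.conversion_day is not None and observed_label(c, snapshot_day) == 0
    )


# Hypothetical toy log: (click day, conversion day or None).
clicks = [Click(0, 0), Click(0, 3), Click(1, None), Click(2, 9), Click(2, 2)]

for day in (2, 4, 10):
    print(day, count_false_negatives(clicks, day))
# prints:
# 2 2
# 4 1
# 10 0
```

Waiting longer before labeling removes false negatives but delays training, which is the trade-off a delay model tries to avoid.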

ESDF Framework

The ESDF framework models CVR prediction across the entire space, addressing all three challenges simultaneously. It achieves this through:

  • Sequential Behavior: ESDF leverages the sequential action patterns of users, such as "impression → click → purchase," to reduce sample selection bias.
  • Shared Embeddings: By sharing embedding layers between Click-Through Rate (CTR) and CVR networks, ESDF alleviates data sparsity issues.
  • Time Delay Model: ESDF uses survival analysis to model delayed conversions, discretizing delay times into daily slots without assuming a specific delay distribution.

    Figure 1: Delayed conversion distribution.

Conversion Model

The conversion model predicts the probability of conversion given that a click has occurred. ESDF constructs auxiliary tasks for CTR and Click-Through Conversion Rate (CTCVR) prediction, both of which are defined over all impressions, which mitigates sample selection bias. CVR is then learned within this multi-task setup on all impression data rather than on clicks alone, which, together with the shared embeddings, alleviates data sparsity.
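A minimal sketch of how such an entire-space objective can be set up, in the style of ESMM, where pCTCVR = pCTR × pCVR and both supervised tasks use labels available for every impression. The summary does not spell out ESDF's exact architecture or loss weighting, so the linear towers, the unweighted loss sum, and all names and data below are assumptions for illustration only.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def log_loss(y, p, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


def entire_space_forward(embedding, w_ctr, w_cvr, item_ids, clicks, conversions):
    """Two towers on one shared embedding table, trained on all impressions."""
    x = embedding[item_ids]          # shared embedding lookup
    p_ctr = sigmoid(x @ w_ctr)       # P(click | impression)
    p_cvr = sigmoid(x @ w_cvr)       # P(conversion | click), never supervised directly
    p_ctcvr = p_ctr * p_cvr          # P(click and conversion | impression)
    ctcvr_labels = clicks * conversions
    # Both losses are computed over every impression, clicked or not.
    loss = log_loss(clicks, p_ctr) + log_loss(ctcvr_labels, p_ctcvr)
    return {"p_ctr": p_ctr, "p_cvr": p_cvr, "p_ctcvr": p_ctcvr, "loss": loss}


# Hypothetical toy batch: 6 impressions over 5 item ids, embedding dim 4.
rng = np.random.default_rng(0)
embedding = rng.normal(size=(5, 4))
w_ctr = rng.normal(size=4)
w_cvr = rng.normal(size=4)
item_ids = np.array([0, 1, 2, 3, 4, 1])
clicks = np.array([1, 0, 1, 0, 0, 1])
conversions = np.array([1, 0, 0, 0, 0, 0])

out = entire_space_forward(embedding, w_ctr, w_cvr, item_ids, clicks, conversions)
print(out["loss"])
```

Because the CVR tower is reached only through the product, its gradients come from all impressions, and the shared embedding table receives updates from the much denser click signal.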

Time Delay Model

The time delay model applies survival analysis to conversion delays without assuming a parametric form such as an exponential or Weibull distribution. Instead, it discretizes the delay into day slots, which the authors describe as more practical and better suited to industrial settings.

Figure 2: Log loss of delayed feedback samples.
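The idea of a distribution-free, day-slot delay model can be sketched with generic discrete-time survival analysis: one hazard per slot, with the survival function as a running product. The mixture likelihood for still-unconverted clicks follows the usual delayed-feedback formulation and is an assumption here, as are the function names and toy hazards; in ESDF the hazards would come from a neural network rather than a fixed array.

```python
import numpy as np


def survival_curve(hazards):
    """S[t] = P(no conversion in slots 0..t) = prod_{k<=t} (1 - h_k)."""
    return np.cumprod(1.0 - hazards)


def delay_pmf(hazards):
    """P(conversion lands in slot t) = h_t * prod_{k<t} (1 - h_k)."""
    survived_before = np.concatenate(([1.0], survival_curve(hazards)[:-1]))
    return hazards * survived_before


def sample_likelihood(p_convert, hazards, observed_slot, elapsed_slots):
    """Likelihood of one click given what has been observed so far.

    observed_slot: day slot of the conversion, or None if none seen yet.
    elapsed_slots: number of complete day slots since the click.
    """
    if observed_slot is not None:
        return p_convert * delay_pmf(hazards)[observed_slot]
    # No conversion yet: either it never converts, or it is still pending.
    still_pending = np.prod(1.0 - hazards[:elapsed_slots])
    return (1.0 - p_convert) + p_convert * still_pending


# Hypothetical hazards for a 3-day window; the last hazard of 1.0 forces
# every eventual conversion to land inside the window.
hazards = np.array([0.5, 0.5, 1.0])
print(delay_pmf(hazards))                        # [0.5  0.25 0.25]
print(sample_likelihood(0.2, hazards, 1, 3))     # 0.2 * 0.25
print(sample_likelihood(0.2, hazards, None, 1))  # 0.8 + 0.2 * 0.5
```

No parametric shape is imposed on the delay: each slot gets its own hazard, so the learned distribution can be multimodal or heavy-tailed if the data call for it.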

Experimental Results

The framework demonstrates significant improvements over existing models on public and production datasets. ESDF achieved a GAUC improvement of approximately 0.08 and a relative improvement (RelaImpr) of 6.68% compared to strong baselines like DFM and ESMM. These results validate ESDF's approach, particularly regarding time delay modeling without distribution assumptions.

Further analysis shows that ESDF remains robust under delayed feedback: it maintains prediction quality across varied delays and outperforms other models, including NAIVE and SHIFT.
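The summary reports GAUC and RelaImpr without defining them. The sketch below uses the definitions common in the CTR/CVR literature, which are assumed here rather than confirmed by the summary: GAUC as the impression-weighted average of per-user AUC, and RelaImpr as the relative gain over a baseline after subtracting the 0.5 score of a random ranker. The toy data and the 0.66 versus 0.65 example are hypothetical and are not the paper's results.

```python
from collections import defaultdict

import numpy as np


def auc(labels, scores):
    """Pairwise AUC; returns None when only one class is present."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels == 1], scores[labels == 0]
    if len(pos) == 0 or len(neg) == 0:
        return None
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def gauc(user_ids, labels, scores):
    """Per-user AUC averaged with impression-count weights."""
    groups = defaultdict(list)
    for u, y, s in zip(user_ids, labels, scores):
        groups[u].append((y, s))
    weighted_sum, total_weight = 0.0, 0
    for rows in groups.values():
        ys, ss = zip(*rows)
        user_auc = auc(ys, ss)
        if user_auc is None:  # users with a single class are skipped
            continue
        weighted_sum += len(rows) * user_auc
        total_weight += len(rows)
    return weighted_sum / total_weight if total_weight else None


def rela_impr(metric_model, metric_base):
    """Relative improvement in percent, measured above the 0.5 random level."""
    return ((metric_model - 0.5) / (metric_base - 0.5) - 1.0) * 100.0


users = ["a", "a", "a", "b", "b", "c"]
labels = [1, 0, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.4, 0.3, 0.6, 0.8]
print(gauc(users, labels, scores))  # 0.6: user a scores 1.0, user b 0.0, user c skipped
print(rela_impr(0.66, 0.65))        # about 6.67
```

Grouping by user removes the effect of score offsets between users, which matters when ranking quality within each user's candidate list is what the system actually consumes.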

Conclusion

ESDF presents a unified solution to three longstanding CVR prediction challenges. By combining entire-space multi-task learning with survival-analysis-based delay modeling, it addresses data sparsity, sample selection bias, and delayed feedback within a single framework. The open-source dataset released with this research provides a resource for future studies of CVR prediction.

More systematic methods to tackle these challenges from a unified perspective need further investigation, setting a potential trajectory for future research in conversion rate prediction modeling.
