Freshness or Accuracy, Why Not Both? Addressing Delayed Feedback via Dynamic Graph Neural Networks (2308.08071v1)

Published 15 Aug 2023 in cs.LG and cs.AI

Abstract: The delayed feedback problem is one of the most pressing challenges in predicting the conversion rate since users' conversions are always delayed in online commercial systems. Although new data are beneficial for continuous training, without complete feedback information, i.e., conversion labels, training algorithms may suffer from overwhelming fake negatives. Existing methods tend to use multitask learning or design data pipelines to solve the delayed feedback problem. However, these methods have a trade-off between data freshness and label accuracy. In this paper, we propose Delayed Feedback Modeling by Dynamic Graph Neural Network (DGDFEM). It includes three stages, i.e., preparing a data pipeline, building a dynamic graph, and training a CVR prediction model. In the model training, we propose a novel graph convolutional method named HLGCN, which leverages both high-pass and low-pass filters to deal with conversion and non-conversion relationships. The proposed method achieves both data freshness and label accuracy. We conduct extensive experiments on three industry datasets, which validate the consistent superiority of our method.

Summary

The paper introduces DGDFEM, a dynamic graph approach that addresses delayed feedback in conversion rate prediction tasks.
It combines a data pipeline that delivers samples as they appear, a dynamic graph of evolving user-item interactions, and HLGCN, a graph convolution that uses both high-pass and low-pass filters.
Experiments on three industry datasets show higher AUC and lower NLL than existing methods, supporting DGDFEM’s effectiveness in online advertising systems.


This paper introduces a novel approach to address the delayed feedback problem in conversion rate (CVR) prediction tasks using a dynamic graph neural network (DGNN). This problem is prominent in online commercial systems where user conversion feedback is inherently delayed, challenging models that strive to balance data freshness with label accuracy.

Delayed Feedback Problem

The core challenge of delayed feedback is predicting whether a user interaction such as a click will eventually convert when the conversion label arrives only later. Training on fresh samples before their labels are complete introduces fake negatives: clicks that look unconverted but convert afterwards. Existing solutions use either multitask learning or purpose-built data pipelines, and both trade data freshness against label accuracy. The proposed method, DGDFEM, instead represents these delayed interactions in a dynamic graph that is updated as labels arrive.

Proposed Methodology: DGDFEM

The DGDFEM framework is structured into three distinct stages:

Data Pipeline Preparation: The novel data pipeline maximizes both data freshness and label accuracy. Samples are continuously delivered as they appear, initially unlabeled, and subsequently marked and re-delivered based on conversion events within a preset attribution window.
Dynamic Graph Construction: Constructing a dynamic graph allows the model to adaptively represent the data's temporal nature, capturing evolving user-item interactions effectively. Nodes in the graph represent users and items, while edges encode interaction attributes derived from sample labels.
Model Training: The model samples multi-hop neighbors from the dynamic graph and aggregates them with the proposed HLGCN method, which applies both high-pass and low-pass filters during graph convolution. Low-pass filters retain the commonality between nodes joined by conversion relationships, while high-pass filters amplify the differences between nodes joined by non-conversion relationships.
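The first two stages can be illustrated with a minimal sketch. It is not the authors' implementation: the tuple layout, the `ATTRIBUTION_WINDOW` value, the function names `deliver` and `build_graph`, and the rule of re-delivering a click as a negative when the window closes are all assumptions made for illustration. Each click is delivered at once without a label, delivered again once its label is known, and every delivery updates a timestamped user-item edge.

```python
from collections import defaultdict

ATTRIBUTION_WINDOW = 10  # hypothetical window length, in arbitrary time units


def deliver(clicks, conversions):
    """Return (time, user, item, label) samples in delivery order.

    clicks: list of (click_time, user, item)
    conversions: dict mapping (user, item) -> conversion_time
    label is None (not yet labeled), 1 (converted), or 0 (window expired).
    """
    samples = []
    for t, user, item in clicks:
        samples.append((t, user, item, None))  # fresh, unlabeled delivery
        conv_t = conversions.get((user, item))
        if conv_t is not None and t <= conv_t <= t + ATTRIBUTION_WINDOW:
            samples.append((conv_t, user, item, 1))  # re-delivered as positive
        else:
            # Assumed rule: no conversion inside the window means negative.
            samples.append((t + ATTRIBUTION_WINDOW, user, item, 0))
    return sorted(samples, key=lambda s: s[0])


def build_graph(samples):
    """Apply deliveries in time order; each edge keeps its latest label and time."""
    graph = defaultdict(dict)  # user -> {item: (label, time)}
    for t, user, item, label in samples:
        graph[user][item] = (label, t)
    return graph


clicks = [(0, "u1", "i1"), (1, "u1", "i2"), (2, "u2", "i1")]
conversions = {("u1", "i1"): 4, ("u2", "i1"): 30}  # the second falls outside the window
samples = deliver(clicks, conversions)
graph = build_graph(samples)
print(graph["u1"])
```

In this toy run the edge (u1, i1) starts unlabeled at time 0 and becomes a conversion edge at time 4, so a model reading the graph between those times sees the fresh interaction without treating it as a negative.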

HLGCN: High-Low Pass Filtered Graph Convolution

HLGCN is the graph convolution at the core of DGDFEM. It uses ConvE to estimate user preferences and, based on that estimate, applies either a high-pass or a low-pass filter to each graph edge. Using both filters lets the model treat conversion and non-conversion relationships differently rather than smoothing over all neighbors alike, which CVR prediction on such a graph requires.
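The effect of mixing the two filters can be sketched on plain vectors. This is an illustrative simplification, not the HLGCN formulation from the paper: the function name `filtered_aggregate` is hypothetical, the ConvE preference estimate is replaced by a known boolean `converted` flag per edge, and the filters are reduced to adding (low-pass) or subtracting (high-pass) the mean-normalized neighbor embedding.

```python
def filtered_aggregate(self_vec, neighbors):
    """Aggregate neighbor embeddings with a per-edge filter choice.

    self_vec: list of floats, the embedding of the center node
    neighbors: list of (neighbor_vec, converted) pairs
    A conversion edge adds the neighbor (low-pass, pulls embeddings together);
    a non-conversion edge subtracts it (high-pass, pushes embeddings apart).
    """
    out = list(self_vec)
    n = len(neighbors)
    if n == 0:
        return out
    for vec, converted in neighbors:
        sign = 1.0 if converted else -1.0
        for k, x in enumerate(vec):
            out[k] += sign * x / n
    return out


user = [1.0, 0.0]
neighbors = [
    ([1.0, 0.0], True),   # item the user converted on
    ([0.0, 1.0], False),  # item the user did not convert on
]
result = filtered_aggregate(user, neighbors)
print(result)  # [1.5, -0.5]
```

The user embedding moves toward the converted item along the first dimension and away from the non-converted item along the second, which is the intuition behind retaining commonality on one kind of edge and amplifying difference on the other.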

Experimental Results

Empirical evaluation on three industry datasets shows that DGDFEM outperforms existing methods. Compared with static and dynamic alternatives, it consistently achieves a higher AUC (area under the ROC curve) and a lower NLL (negative log-likelihood), which supports the claim that it delivers both data freshness and label accuracy.
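For readers unfamiliar with the two metrics, the following sketch computes them from scratch using their standard definitions. The labels and predicted probabilities are made-up toy values, not results from the paper, and the function names `auc` and `nll` are chosen here for illustration.

```python
import math


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def nll(labels, probs, eps=1e-12):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


labels = [1, 0, 1, 0]          # toy conversion labels
probs = [0.9, 0.2, 0.4, 0.6]   # toy predicted conversion probabilities
print(auc(labels, probs))       # 0.75
print(round(nll(labels, probs), 4))
```

AUC depends only on how the predictions are ranked, while NLL also penalizes poorly calibrated probabilities, so reporting both gives a fuller picture of a CVR model than either alone.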

Implementation and Implications

Because samples are delivered as they appear, DGDFEM trains on data that tracks the current distribution, which matters for online advertising systems that need predictions to be both timely and accurate. The dynamic graph makes the model more robust to delayed feedback, and more accurate CVR estimates in turn support budget allocation and strategy decisions in real-time commerce environments.

Conclusion

The DGDFEM framework addresses the delayed feedback problem by combining a dynamic graph with high-pass and low-pass graph filtering. Integrated into real-world systems, it could improve the adaptability and accuracy of predictive models in settings where user interactions change rapidly.