Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection (2005.00625v3)

Published 1 May 2020 in cs.SI, cs.IR, and cs.LG

Abstract: Graph-based models can help detect suspicious fraud online. Owing to the development of Graph Neural Networks (GNNs), prior research has proposed many GNN-based fraud detection frameworks based on either homogeneous or heterogeneous graphs. These works follow the existing GNN framework by aggregating neighboring information to learn node embeddings, which relies on the assumption that neighbors share similar context, features, and relations. However, the inconsistency problem, i.e., context inconsistency, feature inconsistency, and relation inconsistency, has hardly been investigated. In this paper, we introduce these inconsistencies and design a new GNN framework, GraphConsis, to tackle the inconsistency problem: (1) for the context inconsistency, we propose to combine the context embeddings with node features, (2) for the feature inconsistency, we design a consistency score to filter the inconsistent neighbors and generate corresponding sampling probabilities, and (3) for the relation inconsistency, we learn relation attention weights associated with the sampled nodes. Empirical analysis on four datasets indicates that the inconsistency problem is crucial in a fraud detection task. The extensive experiments prove the effectiveness of GraphConsis. We also released a GNN-based fraud detection toolbox with implementations of SOTA models. The code is available at https://github.com/safe-graph/DGFraud.

Citations (231)

Summary

The paper introduces GraphConsis, a framework that mitigates context, feature, and relation inconsistencies in GNN-based fraud detection.
The method leverages context embedding, neighbor sampling, and relation attention to filter and prioritize relevant node information.
Experimental results on the YelpChi dataset show enhanced F1-score and AUC, particularly with limited training data.

Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection

The paper addresses the application of Graph Neural Networks (GNNs) to fraud detection, identifying and tackling a critical issue termed the "inconsistency problem". This inconsistency arises across three dimensions: context, feature, and relation. The researchers introduced a novel GNN framework, GraphConsis, designed specifically to mitigate these inconsistencies and enhance the effectiveness of fraud detection.

Inconsistency Problems in GNN-based Fraud Detection

The traditional assumption in GNN-based fraud detection is that neighboring nodes exhibit similar attributes and labels. However, this assumption frequently fails in fraud scenarios due to three observed inconsistencies:

Context Inconsistency: Fraudsters may connect to legitimate entities to camouflage their activities. Aggregating neighborhood information then produces mixed or misleading representations that hinder accurate fraud detection.
Feature Inconsistency: Connected nodes may have significantly different features, as fraudsters often operate across disparate contexts. For instance, reviews from the same user on different product categories are linked yet exhibit dissimilar features, which makes direct feature aggregation unreliable.
Relation Inconsistency: Entities are connected through multiple types of relations, and each type is relevant to fraud detection to a different degree. For example, connections based on common users might be more indicative of fraudulent activity than those based on common products.

The GraphConsis Framework

GraphConsis remedies these issues by incorporating three novel mechanisms:

Context Embedding: To address context inconsistency, a trainable context embedding augments each node's feature vector, capturing local structural information and aiding in fraud detection.
Neighbor Sampling: To address feature inconsistency, a consistency score based on feature similarity filters out inconsistent neighbors and determines the sampling probability of those that remain, so that only relevant information is aggregated.
Relation Attention: By learning attention weights for different relations, GraphConsis effectively handles relation inconsistency, prioritizing more indicative relational data during aggregation.
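The three mechanisms above can be illustrated with a minimal pure-Python sketch. It is not the DGFraud implementation. Several details are assumptions made for illustration: the context embedding is combined with the features by concatenation, the consistency score is taken to be exp(-squared Euclidean distance), the threshold value is arbitrary, and the attention logits depend only on the relation type, with fixed numbers standing in for learned parameters. All node names, feature values, and relation names are hypothetical.

```python
import math
import random

def consistency_score(h_u, h_v):
    """Assumed score: exp(-squared Euclidean distance), equal to 1.0 for identical vectors."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(h_u, h_v)))

def sampling_probabilities(center, neighbors, threshold):
    """Drop neighbors scoring below `threshold`, then normalize the surviving scores."""
    scores = {name: consistency_score(center, h) for name, h in neighbors.items()}
    kept = {name: s for name, s in scores.items() if s >= threshold}
    total = sum(kept.values())
    return {name: s / total for name, s in kept.items()}

def relation_attention(embeddings, relations, relation_logit):
    """Softmax over per-relation logits, then a weighted sum of neighbor embeddings."""
    logits = [relation_logit[r] for r in relations]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    weights = [e / sum(exps) for e in exps]
    dim = len(embeddings[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, embeddings)) for i in range(dim)]
    return weights, pooled

# Hypothetical toy graph: center node "v" with three neighbors.
raw = {"v": [0.0, 0.0], "a": [0.1, 0.0], "b": [0.0, 0.2], "c": [3.0, 3.0]}
context = {"v": [0.5], "a": [0.5], "b": [0.4], "c": [0.5]}  # stand-ins for trainable values
relation_to_v = {"a": "same_user", "b": "same_product", "c": "same_product"}
relation_logit = {"same_user": 1.0, "same_product": 0.0}  # stand-ins for learned logits

# 1) Context embedding: concatenate it with the raw features (assumed combination).
h = {name: raw[name] + context[name] for name in raw}

# 2) Neighbor sampling: filter by consistency score, then sample by normalized score.
neighbors = {name: h[name] for name in ("a", "b", "c")}
probs = sampling_probabilities(h["v"], neighbors, threshold=0.5)
rng = random.Random(0)
sampled = rng.choices(list(probs), weights=list(probs.values()), k=2)  # with replacement

# 3) Relation attention: weight the sampled neighbors by relation type and aggregate.
weights, pooled = relation_attention(
    [h[name] for name in sampled],
    [relation_to_v[name] for name in sampled],
    relation_logit,
)
print(probs, sampled, weights, pooled)
```

In this toy setup the dissimilar neighbor "c" falls below the threshold and is never sampled, while "a" and "b" receive sampling probabilities proportional to their scores. A trained model would learn the context embeddings and attention parameters rather than fix them by hand.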

Experimental Results

The researchers evaluated GraphConsis on the YelpChi review spam dataset against baseline methods including Logistic Regression, GraphSAGE, and Player2Vec. GraphConsis outperformed the baselines in both F1-score and AUC, particularly when less training data was available. This improvement underscores the importance of addressing the inconsistency problem, which traditional GNN methods fail to mitigate effectively.
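For readers unfamiliar with the two reported metrics, the following sketch shows how F1-score and AUC can be computed for a binary fraud classifier. The labels and scores are made-up toy values, not results from the paper. Treating fraud as the positive class and thresholding scores at 0.5 are assumptions, and the paper may use a different averaging scheme for F1.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for the positive (fraud) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1 = fraud, 0 = benign; scores are hypothetical model outputs.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.3f}, AUC = {auc:.3f}")
```

F1 depends on the chosen decision threshold, whereas AUC evaluates the ranking of scores across all thresholds, which is one reason the two are often reported together on imbalanced fraud data.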

Implications and Future Directions

GraphConsis advances the application of GNNs to fraud detection by introducing mechanisms that explicitly tackle inconsistencies. According to the reported results, this approach improves detection accuracy, and the framework could be adapted to other graph-based learning tasks characterized by similar inconsistencies.

Future research directions include refining the neighbor sampling process to adaptively set thresholds for each relation, thereby optimizing the GNN's receptive field. Furthermore, extending this approach to a diverse range of datasets and fraud scenarios could reveal additional insights into the nuanced behavior of fraudsters and the robustness required of GNN frameworks.

Overall, GraphConsis contributes a meaningful methodology to the field of GNN-based fraud detection, suggesting pathways for enhanced analytical frameworks and real-world applications.

GitHub

GitHub - safe-graph/DGFraud: A Deep Graph-based Toolbox for Fraud Detection (700 stars)