- The paper presents CF-GNN, a framework integrating conformal prediction with graph neural networks (GNNs) to provide uncertainty estimates with distribution-free coverage guarantees.
- It introduces a topology-aware output correction model that refines predictions using network structure, reducing prediction set sizes by up to 74%.
- Extensive experiments across 15 datasets demonstrate that CF-GNN achieves target marginal coverage and strong empirical conditional coverage in node classification and regression.
The paper "Uncertainty Quantification over Graph with Conformalized Graph Neural Networks" (arXiv:2305.14535) introduces CF-GNN, a framework that extends conformal prediction to GNNs, enabling rigorous uncertainty quantification for graph-structured data. The approach produces prediction sets (for classification) or prediction intervals (for regression) with a guaranteed coverage probability, addressing a critical gap in GNN deployment where the cost of errors is significant.
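The coverage guarantee rests on standard split conformal prediction: score a held-out calibration set, take a finite-sample-corrected quantile of those scores, and include every candidate label whose score falls below it. The following is a minimal numpy sketch on synthetic softmax outputs; the data are hypothetical, and the score `1 - p_true` is one common choice of non-conformity score, not necessarily the paper's exact one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: softmax outputs for 200 calibration nodes and
# 5 test nodes over 3 classes, plus calibration labels.
n_cal, n_classes = 200, 3
cal_probs = rng.dirichlet(np.ones(n_classes) * 0.5, size=n_cal)
cal_labels = rng.integers(0, n_classes, size=n_cal)
test_probs = rng.dirichlet(np.ones(n_classes) * 0.5, size=5)

alpha = 0.1  # target miscoverage: aim for >= 90% marginal coverage

# Non-conformity score: 1 minus the model probability of the true class.
cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]

# Conformal quantile with the (n + 1) finite-sample correction.
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
qhat = np.quantile(cal_scores, q_level, method="higher")

# Prediction set: every class whose score falls at or below the quantile.
pred_sets = (1.0 - test_probs) <= qhat
print([np.flatnonzero(s).tolist() for s in pred_sets])
```

Under exchangeability of calibration and test points, sets built this way contain the true label with probability at least `1 - alpha` marginally.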
Addressing Exchangeability in Graph Data
A key contribution of the paper lies in establishing the validity of conformal prediction for graphs in transductive settings. The paper demonstrates that standard conformal prediction remains valid if the non-conformity score is invariant to the ordering of calibration and test samples, a condition readily satisfied by many GNN models. This permutation invariance enables the application of conformal prediction to GNNs without compromising statistical guarantees.
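The permutation-invariance condition can be illustrated with a toy score function that, like a transductively trained GNN, depends on all nodes symmetrically (here via graph-wide normalization, a stand-in example rather than the paper's construction): relabeling which node sits in which slot simply permutes the scores, so the calibration quantile, and hence validity, is unaffected.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical raw outputs for 10 nodes (calibration and test pooled).
raw = rng.normal(size=10)

def scores(outputs):
    # A score depending on ALL nodes symmetrically (graph-wide
    # normalization), mimicking a model that does not distinguish
    # which nodes are calibration vs. test.
    return (outputs - outputs.mean()) / outputs.std()

perm = rng.permutation(10)

# Permuting the node ordering permutes the scores identically, so any
# quantile computed from them is unchanged.
assert np.allclose(scores(raw)[perm], scores(raw[perm]))
```

This is exactly the invariance that lets conformal prediction carry over to the transductive graph setting without modification.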
Topology-Aware Output Correction Model
To enhance the efficiency of conformal prediction (CP), the authors propose a topology-aware correction model that learns to update predictions based on network structure. This model leverages the observed correlation between non-conformity scores and network topology to refine predictions and reduce the size of prediction sets or the length of prediction intervals. The correction model is trained by minimizing a differentiable inefficiency loss that simulates CP set sizes or interval lengths, while staying within the theoretical framework of graph exchangeability so that the coverage guarantee is preserved.
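A key ingredient of such training is making the set size differentiable. A common relaxation (in the spirit of conformal training; an illustrative assumption, not the paper's exact loss) replaces the hard inclusion indicator with a sigmoid, so the expected set size can be backpropagated through:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def smooth_set_size(scores, qhat, tau=0.1):
    # Differentiable surrogate for the prediction-set size: a class is
    # "softly" included when its non-conformity score is below qhat.
    # tau controls the sharpness of the relaxation.
    return sigmoid((qhat - scores) / tau).sum(axis=1).mean()

rng = np.random.default_rng(2)
scores = rng.uniform(size=(100, 5))  # hypothetical non-conformity scores
qhat = 0.6

hard = (scores <= qhat).sum(axis=1).mean()
soft = smooth_set_size(scores, qhat, tau=0.01)
print(hard, soft)  # the relaxation approaches the hard size as tau -> 0
```

Minimizing a loss like `smooth_set_size` with respect to the correction model's parameters pushes toward smaller sets while the conformal calibration step keeps coverage valid.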
Experimental Validation
The paper presents extensive experimental results across 15 datasets for both node classification and regression tasks. CF-GNN consistently achieves the pre-defined target marginal coverage, outperforming existing uncertainty quantification methods that often fail to meet coverage guarantees. Furthermore, CF-GNN reduces prediction set sizes or interval lengths by up to 74% compared to a direct application of conformal prediction to GNNs. The method also demonstrates strong empirical conditional coverage over various network features.
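The two quantities reported throughout these experiments, empirical marginal coverage and average set size (efficiency), are straightforward to compute; a small sketch on hand-made hypothetical sets:

```python
import numpy as np

def coverage_and_efficiency(pred_sets, labels):
    # pred_sets: boolean array (n_test, n_classes), True = class included.
    # labels: integer array (n_test,) of true classes.
    covered = pred_sets[np.arange(len(labels)), labels]
    return covered.mean(), pred_sets.sum(axis=1).mean()

# Hypothetical example: 3 test nodes, 3 classes.
pred_sets = np.array([[True, False, True],
                      [True, True, False],
                      [False, True, False]])
labels = np.array([0, 1, 2])

cov, avg_size = coverage_and_efficiency(pred_sets, labels)
print(cov, avg_size)
```

Coverage should land at or above the target `1 - alpha`; between two methods that both hit the target, the one with the smaller average size is the more efficient.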
Implications and Future Directions
The CF-GNN framework offers a practical approach to uncertainty quantification in GNNs, providing statistically sound and efficient prediction sets or intervals. By addressing the challenges of exchangeability in graph data and incorporating topology-aware corrections, this research advances the reliable deployment of GNNs in critical applications. Future research directions include generalizing the inefficiency loss to other desirable CP properties such as robustness and conditional coverage, extensions to inductive settings or transductive but non-random split settings, and extensions to other graph tasks such as link prediction and community detection.