Optimizing OOD Detection in Molecular Graphs: A Novel Approach with Diffusion Models (2404.15625v1)

Published 24 Apr 2024 in cs.LG

Abstract: Open-world test datasets are often mixed with out-of-distribution (OOD) samples, on which deployed models struggle to make accurate predictions. Traditional detection methods must trade off OOD detection against in-distribution (ID) classification performance because the two tasks share the same representation learning model. In this work, we propose to detect OOD molecules with an auxiliary diffusion-model-based framework that compares similarities between input molecules and their reconstructed graphs. Because the generative model is biased toward reconstructing ID training samples, the similarity scores of OOD molecules are much lower, which facilitates detection. Although conceptually simple, extending this vanilla framework to practical detection applications is limited by two significant challenges. First, popular similarity metrics based on Euclidean distance fail to account for complex graph structure. Second, the generative model, which involves iterative denoising steps, is time-consuming, especially when run over an enormous pool of drug candidates. To address these challenges, we propose Prototypical Graph Reconstruction for Molecular OOD Detection, dubbed PGR-MOOD, which hinges on three innovations: i) an effective metric that comprehensively quantifies the matching degree between input and reconstructed molecules; ii) a creative graph generator that constructs prototypical graphs that are in line with ID data but far from OOD data; and iii) an efficient and scalable OOD detector that compares test samples against the pre-constructed prototypical graphs, omitting the generative process for every new molecule. Extensive experiments on ten benchmark datasets against six baselines demonstrate the superiority of our approach.
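To make the prototype-based scoring idea concrete, here is a minimal Python sketch, not the authors' implementation: it assumes the prototypical graphs have already been generated offline and summarized as fixed-length embeddings, and it substitutes a plain cosine similarity for the paper's structure-aware matching metric. The names `embed_graph`, `cosine`, and `ood_score` are illustrative placeholders.

```python
# Sketch of prototype-based OOD scoring: embed a test molecule, compare it to
# pre-built ID prototype embeddings, and flag it as OOD if its best match is weak.
# The embedding and similarity functions below are toy stand-ins, not PGR-MOOD's.
import numpy as np

def embed_graph(adjacency: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Toy graph embedding: one round of neighborhood aggregation, then mean-pool."""
    aggregated = adjacency @ features + features   # add neighbor features to each node's own
    return aggregated.mean(axis=0)                 # pool node vectors into one graph-level vector

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def ood_score(adjacency: np.ndarray, features: np.ndarray, prototypes: list) -> float:
    """Similarity of a test molecule to its closest ID prototype; OOD inputs score lower."""
    z = embed_graph(adjacency, features)
    return max(cosine(z, p) for p in prototypes)

# Usage with random placeholder graphs (adjacency matrix + node-feature matrix).
rng = np.random.default_rng(0)
prototypes = [embed_graph(rng.integers(0, 2, (8, 8)).astype(float),
                          rng.normal(size=(8, 4)))
              for _ in range(16)]                  # stands in for offline-generated prototypical graphs
adj, feat = rng.integers(0, 2, (10, 10)).astype(float), rng.normal(size=(10, 4))
score = ood_score(adj, feat, prototypes)
print("flagged as OOD:", score < 0.5)              # threshold would be tuned on a validation split
```

The key efficiency point from the abstract survives even in this toy form: the expensive generative step happens only once, when building the prototypes, while each new molecule needs only an embedding and a set of similarity comparisons.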
