Over-Squashing in Graph Neural Networks: A Comprehensive Survey (2308.15568v6)
Abstract: Graph Neural Networks (GNNs) have revolutionized machine learning on graph-structured data by effectively capturing complex relationships. They disseminate information by passing messages between interconnected nodes, but long-range interactions suffer from "over-squashing": information from an exponentially growing receptive field is compressed into fixed-size node representations, hindering tasks that rely on intricate long-distance interactions. This survey comprehensively explores the causes and consequences of over-squashing and the strategies proposed to mitigate it. It reviews a range of methodologies, including graph rewiring, novel normalization schemes, spectral analysis, and curvature-based strategies, with a focus on their trade-offs and effectiveness. The survey also discusses the interplay between over-squashing and other GNN limitations, such as over-smoothing, and provides a taxonomy of models designed to address these issues in node- and graph-level tasks. Benchmark datasets for performance evaluation are also detailed, making this survey a valuable resource for researchers and practitioners in the GNN field.
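Among the mitigation strategies surveyed, graph rewiring lends itself most directly to a small code illustration. The sketch below is a minimal, hypothetical example rather than the algorithm of any particular paper: it greedily adds edges that enlarge the spectral gap of the normalized Laplacian, the quantity that spectral rewiring methods such as FoSR target, since a larger gap loosens the structural bottlenecks associated with over-squashing. It assumes `networkx` and `numpy` are available; the helper names `spectral_gap` and `greedy_rewire` are illustrative, not part of any library.

```python
# Illustrative sketch only: greedy spectral-gap rewiring, in the spirit of
# spectral approaches to over-squashing (e.g., FoSR), not a reproduction of
# any specific algorithm. Function names are hypothetical.

import itertools
import networkx as nx
import numpy as np


def spectral_gap(G: nx.Graph) -> float:
    """Second-smallest eigenvalue (lambda_2) of the normalized Laplacian."""
    L = nx.normalized_laplacian_matrix(G).toarray()
    eigvals = np.sort(np.linalg.eigvalsh(L))
    return float(eigvals[1])


def greedy_rewire(G: nx.Graph, num_edges: int = 5) -> nx.Graph:
    """Add `num_edges` non-edges, each chosen to maximize the spectral gap."""
    H = G.copy()
    for _ in range(num_edges):
        candidates = [e for e in itertools.combinations(H.nodes, 2)
                      if not H.has_edge(*e)]
        if not candidates:
            break
        # Evaluate each candidate edge on a copy and keep the best one.
        best = max(candidates,
                   key=lambda e: spectral_gap(nx.Graph(list(H.edges) + [e])))
        H.add_edge(*best)
    return H


if __name__ == "__main__":
    # Two dense communities joined by a single bridge: a classic bottleneck.
    G = nx.barbell_graph(6, 0)
    print("gap before:", round(spectral_gap(G), 4))
    H = greedy_rewire(G, num_edges=3)
    print("gap after: ", round(spectral_gap(H), 4))
```

On a barbell graph (two cliques joined by one bridge edge), even a handful of added edges noticeably enlarges the gap, which conveys the intuition behind rewiring-based remedies; practical methods replace the brute-force candidate search with cheaper first-order or curvature-based criteria.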