CSA-Trans: Code Structure Aware Transformer for AST (2404.05767v1)
Abstract: When applying the Transformer architecture to source code, designing a good self-attention mechanism is critical, as it determines how node relationships are extracted from the Abstract Syntax Trees (ASTs) of the source code. We present Code Structure Aware Transformer (CSA-Trans), which uses a Code Structure Embedder (CSE) to generate a dedicated Positional Encoding (PE) for each AST node; CSE produces these node PEs using disentangled attention. To further extend the self-attention capability, we adopt Stochastic Block Model (SBM) attention. Our evaluation shows that our PE captures the relationships between AST nodes better than other graph-related PE techniques. We also show, through quantitative and qualitative analysis, that SBM attention generates more node-specific attention coefficients. Finally, we demonstrate that CSA-Trans outperforms 14 baselines on code summarization tasks for both Python and Java, while being 41.92% faster than AST-Trans and 25.31% more memory efficient than SG-Trans on the Java dataset.
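The abstract only sketches the mechanism, so the snippet below is a minimal, hypothetical PyTorch illustration of how disentangled attention (in the DeBERTa sense) could turn AST node features and pairwise tree distances into node-specific positional encodings. The class and tensor names (`DisentangledPositionalEncoder`, `node_feats`, `rel_dist`, `max_rel_dist`) are assumptions for illustration only, not the authors' CSE implementation.

```python
# Hypothetical sketch of a CSE-like positional encoder using disentangled attention.
# Not the CSA-Trans implementation; names and shapes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DisentangledPositionalEncoder(nn.Module):
    """Attention over AST nodes with disentangled content/position terms,
    returning one learned PE vector per node."""

    def __init__(self, d_model: int, max_rel_dist: int):
        super().__init__()
        self.d_model = d_model
        self.max_rel_dist = max_rel_dist
        # content projections
        self.q_c = nn.Linear(d_model, d_model)
        self.k_c = nn.Linear(d_model, d_model)
        self.v_c = nn.Linear(d_model, d_model)
        # projections applied to relative-position embeddings
        self.q_r = nn.Linear(d_model, d_model)
        self.k_r = nn.Linear(d_model, d_model)
        # one embedding per clipped signed AST distance
        self.rel_emb = nn.Embedding(2 * max_rel_dist + 1, d_model)

    def forward(self, node_feats: torch.Tensor, rel_dist: torch.Tensor) -> torch.Tensor:
        # node_feats: (N, d_model) node type/token embeddings
        # rel_dist:   (N, N) signed pairwise AST distances (long tensor)
        idx = rel_dist.clamp(-self.max_rel_dist, self.max_rel_dist) + self.max_rel_dist
        r = self.rel_emb(idx)                                  # (N, N, d_model)

        qc, kc, vc = self.q_c(node_feats), self.k_c(node_feats), self.v_c(node_feats)

        c2c = qc @ kc.t()                                      # content-to-content
        c2p = torch.einsum("id,ijd->ij", qc, self.k_r(r))      # content-to-position
        p2c = torch.einsum("jd,ijd->ij", kc, self.q_r(r))      # position-to-content

        attn = F.softmax((c2c + c2p + p2c) / (3 * self.d_model) ** 0.5, dim=-1)
        return attn @ vc                                       # (N, d_model) node PEs


# Toy usage: 5 AST nodes with random features and random tree distances.
encoder = DisentangledPositionalEncoder(d_model=16, max_rel_dist=8)
feats = torch.randn(5, 16)
dists = torch.randint(-8, 9, (5, 5))
node_pe = encoder(feats, dists)   # shape: (5, 16)
```

In the paper's pipeline, such learned node PEs would be combined with node embeddings and fed to a Transformer whose self-attention is sparsified by SBM attention; that stage is omitted from this sketch.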
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems NIPS, 2017.
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” in 9th International Conference on Learning Representations, ICLR, 2021.
- T. B. Brown, B. Mann, N. Ryder, et al., “Language models are few-shot learners,” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems NeurIPS, 2020.
- A. Nambiar, M. Heflin, S. Liu, S. Maslov, M. Hopkins, and A. M. Ritz, “Transforming the language of life: Transformer neural networks for protein prediction tasks,” in International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB, 2020.
- P. Shaw, J. Uszkoreit, and A. Vaswani, “Self-attention with relative position representations,” in North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, 2018.
- X. Chu, B. Zhang, Z. Tian, X. Wei, and H. Xia, “Do we really need explicit position encodings for vision transformers?” arXiv preprint, 2021.
- V. P. Dwivedi, C. K. Joshi, T. Laurent, Y. Bengio, and X. Bresson, “Benchmarking graph neural networks,” arXiv preprint, 2020.
- V. P. Dwivedi, A. T. Luu, T. Laurent, Y. Bengio, and X. Bresson, “Graph neural networks with learnable structural and positional representations,” in The Tenth International Conference on Learning Representations, ICLR, 2022.
- R. Minelli, A. Mocci, and M. Lanza, “I know what you did last summer - an investigation of how developers spend their time,” in IEEE 23rd International Conference on Program Comprehension (ICPC), 2015.
- X. Xia, L. Bao, D. Lo, Z. Xing, A. E. Hassan, and S. Li, “Measuring program comprehension: A large-scale field study with professionals,” IEEE Transactions on Software Engineering, 2018.
- X. Hu, G. Li, X. Xia, D. Lo, and Z. Jin, “Deep code comment generation,” in IEEE/ACM 26th International Conference on Program Comprehension (ICPC), 2018.
- P. W. McBurney and C. McMillan, “Automatic documentation generation via source code summarization of method context,” in 22nd International Conference on Program Comprehension, ICPC 2014, Hyderabad, India, June 2-3, 2014, 2014.
- S. Haiduc, J. Aponte, and A. Marcus, “Supporting program comprehension with source code summarization,” in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2, ICSE 2010, Cape Town, South Africa, 1-8 May 2010, 2010.
- Z. Tang, X. Shen, C. Li, J. Ge, L. Huang, Z. Zhu, and B. Luo, “Ast-trans: Code summarization with efficient tree-structured attention,” in IEEE/ACM 44th International Conference on Software Engineering (ICSE), 2022.
- S. Gao, C. Gao, Y. He, J. Zeng, L. Y. Nie, and X. Xia, “Code structure guided transformer for source code summarization,” ACM Transactions on Software Engineering and Methodology, 2021.
- J. Guo, J. Liu, Y. Wan, L. Li, and P. Zhou, “Modeling hierarchical syntax structure with triplet position for source code summarization,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, 2022.
- S. Cho, S. Min, J. Kim, M. Lee, H. Lee, and S. Hong, “Transformers meet stochastic block models: Attention with data-adaptive sparsity and cost,” in Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems (NeurIPS), 2022.
- V. L. Shiv and C. Quirk, “Novel positional encodings to enable tree-based transformers,” in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems NeurIPS, 2019.
- S. Iyer, I. Konstas, A. Cheung, and L. Zettlemoyer, “Summarizing source code using a neural attention model,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL, 2016.
- A. Eriguchi, K. Hashimoto, and Y. Tsuruoka, “Tree-to-sequence attentional neural machine translation,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL, 2016.
- H. Wang, H. Yin, M. Zhang, and P. Li, “Equivariant and stable positional encoding for more powerful graph neural networks,” in The Tenth International Conference on Learning Representations, ICLR, 2022.
- P. He, X. Liu, J. Gao, and W. Chen, “Deberta: decoding-enhanced bert with disentangled attention,” in 9th International Conference on Learning Representations, ICLR, 2021.
- X. Hu, G. Li, X. Xia, D. Lo, S. Lu, and Z. Jin, “Summarizing source code with transferred API knowledge,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI, 2018.
- Y. Wan, Z. Zhao, M. Yang, G. Xu, H. Ying, J. Wu, and P. S. Yu, “Improving automatic source code summarization via deep reinforcement learning,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE, 2018.
- K. Papineni, S. Roukos, T. Ward, and W. Zhu, “Bleu: a method for automatic evaluation of machine translation,” in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics ACL, 2002.
- S. Jiang, A. Armaly, and C. McMillan, “Automatically generating commit messages from diffs using neural machine translation,” in Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, ASE, 2017.
- J. Zhang, M. Utiyama, E. Sumita, G. Neubig, and S. Nakamura, “Guiding neural machine translation with retrieved translation pieces,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, 2018.
- S. Banerjee and A. Lavie, “METEOR: an automatic metric for MT evaluation with improved correlation with human judgments,” in Proceedings of the Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization@ACL, 2005.
- C.-Y. Lin, “ROUGE: A package for automatic evaluation of summaries,” in Text Summarization Branches Out (ACL Workshop), 2004.
- B. Wei, G. Li, X. Xia, Z. Fu, and Z. Jin, “Code generation as a dual task of code summarization,” in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems NeurIPS, 2019.
- W. U. Ahmad, S. Chakraborty, B. Ray, and K. Chang, “A transformer-based approach for source code summarization,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, 2020.
- U. Alon, S. Brody, O. Levy, and E. Yahav, “code2seq: Generating sequences from structured representations of code,” in 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019, 2019.
- Y. Choi, J. Bak, C. Na, and J. Lee, “Learning sequential and structural information for source code summarization,” in Findings of the Association for Computational Linguistics: ACL/IJCNLP, 2021.
- V. J. Hellendoorn, C. Sutton, R. Singh, P. Maniatis, and D. Bieber, “Global relational models of source code,” in 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020, 2020.
- Y. Wang, W. Wang, S. R. Joty, and S. C. H. Hoi, “CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation,” in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP, 2021.
- I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” in 7th International Conference on Learning Representations, ICLR, 2019.
- H. Peng, G. Li, W. Wang, Y. Zhao, and Z. Jin, “Integrating tree path in transformer for code representation,” in 35th Conference on Neural Information Processing Systems (NeurIPS), 2021.
- S. Liu, Y. Chen, X. Xie, J. K. Siow, and Y. Liu, “Retrieval-augmented generation for code summarization via hybrid GNN,” in 9th International Conference on Learning Representations ICLR, 2021.
- J. Qiu, J. Tang, H. Ma, Y. Dong, K. Wang, and J. Tang, “DeepInf: Social influence prediction with deep learning,” in ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD, 2018.
- S. Wu, Y. Tang, Y. Zhu, L. Wang, X. Xie, and T. Tan, “Session-based recommendation with graph neural networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2019.
- J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in International Conference on Machine Learning, ICML, 2017.
- T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in 5th International Conference on Learning Representations, ICLR, 2017.
- P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, “Graph attention networks,” in 6th International Conference on Learning Representations, ICLR, 2018.
- C. Ying, T. Cai, S. Luo, S. Zheng, G. Ke, D. He, Y. Shen, and T.-Y. Liu, “Do transformers really perform bad for graph representation?” in 35th Conference on Neural Information Processing Systems NeurIPS, 2021.
- H. Maron, H. Ben-Hamu, N. Shamir, and Y. Lipman, “Invariant and equivariant graph networks,” in 7th International Conference on Learning Representations, ICLR, 2019.
- J. Kim, S. Oh, and S. Hong, “Transformers generalize deepsets and can be extended to graphs & hypergraphs,” in Annual Conference on Neural Information Processing Systems, NeurIPS, 2021.
- J. Kim, T. D. Nguyen, S. Min, S. Cho, M. Lee, H. Lee, and S. Hong, “Pure transformers are powerful graph learners,” in Annual Conference on Neural Information Processing Systems, NeurIPS, 2022.
- Saeyoon Oh
- Shin Yoo