CSA-Trans: Code Structure Aware Transformer for AST (2404.05767v1)

Published 7 Apr 2024 in cs.SE and cs.AI

Abstract: When applying the Transformer architecture to source code, designing a good self-attention mechanism is critical, as it affects how node relationships are extracted from the Abstract Syntax Trees (ASTs) of the source code. We present Code Structure Aware Transformer (CSA-Trans), which uses a Code Structure Embedder (CSE) to generate a node-specific Positional Encoding (PE) for each node in the AST. CSE generates the node PE using disentangled attention. To further extend the self-attention capability, we adopt Stochastic Block Model (SBM) attention. Our evaluation shows that our PE captures the relationships between AST nodes better than other graph-related PE techniques. We also show, through quantitative and qualitative analysis, that SBM attention generates more node-specific attention coefficients. We demonstrate that CSA-Trans outperforms 14 baselines in code summarization tasks for both Python and Java, while being 41.92% faster and 25.31% more memory efficient on the Java dataset compared to AST-Trans and SG-Trans, respectively.
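
The abstract names two mechanisms: disentangled attention, which the CSE uses to produce a node-specific positional encoding, and SBM attention, which softly clusters nodes and reweights attention by learned block connectivity. The PyTorch sketch below is only an illustrative approximation of those two ideas under stated assumptions, not the authors' implementation; the class names (DisentangledPE, SBMAttention), the simple soft gating, and all hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DisentangledPE(nn.Module):
    """Produce a positional encoding per AST node by attending over initial
    positional features with content-to-position and position-to-content terms,
    in the spirit of DeBERTa-style disentangled attention."""

    def __init__(self, d_model: int):
        super().__init__()
        self.q_c = nn.Linear(d_model, d_model)  # content query
        self.k_c = nn.Linear(d_model, d_model)  # content key
        self.q_p = nn.Linear(d_model, d_model)  # position query
        self.k_p = nn.Linear(d_model, d_model)  # position key
        self.v = nn.Linear(d_model, d_model)    # value over positional features

    def forward(self, content: torch.Tensor, position: torch.Tensor) -> torch.Tensor:
        # content, position: (batch, num_nodes, d_model)
        scale = content.size(-1) ** 0.5
        a_cc = self.q_c(content) @ self.k_c(content).transpose(-1, -2)   # content-to-content
        a_cp = self.q_c(content) @ self.k_p(position).transpose(-1, -2)  # content-to-position
        a_pc = self.q_p(position) @ self.k_c(content).transpose(-1, -2)  # position-to-content
        attn = F.softmax((a_cc + a_cp + a_pc) / scale, dim=-1)
        return attn @ self.v(position)  # node-specific positional encoding


class SBMAttention(nn.Module):
    """Self-attention gated by a soft stochastic-block-model structure: nodes are
    softly assigned to K blocks, and pairwise scores are weighted by a learned
    inter-block connectivity matrix."""

    def __init__(self, d_model: int, num_blocks: int = 4):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.cluster = nn.Linear(d_model, num_blocks)              # soft block assignment
        self.block_conn = nn.Parameter(torch.zeros(num_blocks, num_blocks))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = F.softmax(q @ k.transpose(-1, -2) / (x.size(-1) ** 0.5), dim=-1)
        z = F.softmax(self.cluster(x), dim=-1)                     # (batch, nodes, K)
        gate = z @ torch.sigmoid(self.block_conn) @ z.transpose(-1, -2)
        attn = scores * gate                                       # block-aware reweighting
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        return attn @ v


if __name__ == "__main__":
    nodes = torch.randn(2, 16, 64)      # toy AST node (content) embeddings
    pos_feats = torch.randn(2, 16, 64)  # toy initial positional features
    pe = DisentangledPE(64)(nodes, pos_feats)
    out = SBMAttention(64)(nodes + pe)
    print(pe.shape, out.shape)          # both torch.Size([2, 16, 64])
```

In the paper, the CSE-produced PE is combined with the node embeddings before the encoder, and the reported gains (e.g. the 41.92% speedup) come from the full model; this sketch is only meant to make the two attention variants concrete.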

References (47)
  1. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems NIPS, 2017.
  2. A. Dosovitskiy, L. Beyer, A. Kolesnikov, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” in 9th International Conference on Learning Representations, ICLR, 2021.
  3. T. B. Brown, B. Mann, N. Ryder, et al., “Language models are few-shot learners,” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems NeurIPS, 2020.
  4. A. Nambiar, M. Heflin, S. Liu, S. Maslov, M. Hopkins, and A. M. Ritz, “Transforming the language of life: Transformer neural networks for protein prediction tasks,” in International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB, 2020.
  5. P. Shaw, J. Uszkoreit, and A. Vaswani, “Self-attention with relative position representations,” in North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, 2018.
  6. X. Chu, B. Zhang, Z. Tian, X. Wei, and H. Xia, “Do we really need explicit position encodings for vision transformers?” 2021.
  7. V. P. Dwivedi, C. K. Joshi, T. Laurent, Y. Bengio, and X. Bresson, “Benchmarking graph neural networks,” 2020.
  8. V. P. Dwivedi, A. T. Luu, T. Laurent, Y. Bengio, and X. Bresson, “Graph neural networks with learnable structural and positional representations,” in The Tenth International Conference on Learning Representations, ICLR, 2022.
  9. R. Minelli, A. Mocci, and M. Lanza, “I know what you did last summer - an investigation of how developers spend their time,” in IEEE 23rd International Conference on Program Comprehension (ICPC), 2015.
  10. X. Xia, L. Bao, D. Lo, Z. Xing, A. E. Hassan, and S. Li, “Measuring program comprehension: A large-scale field study with professionals,” 2018.
  11. X. Hu, G. Li, X. Xia, D. Lo, and Z. Jin, “Deep code comment generation,” in IEEE/ACM 26th International Conference on Program Comprehension (ICPC), 2018.
  12. P. W. McBurney and C. McMillan, “Automatic documentation generation via source code summarization of method context,” in 22nd International Conference on Program Comprehension, ICPC 2014, Hyderabad, India, June 2-3, 2014, 2014.
  13. S. Haiduc, J. Aponte, and A. Marcus, “Supporting program comprehension with source code summarization,” in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2, ICSE 2010, Cape Town, South Africa, 1-8 May 2010, 2010.
  14. Z. Tang, X. Shen, C. Li, J. Ge, L. Huang, Z. Zhu, and B. Luo, “Ast-trans: Code summarization with efficient tree-structured attention,” in IEEE/ACM 44th International Conference on Software Engineering (ICSE), 2022.
  15. S. Gao, C. Gao, Y. He, J. Zeng, L. Y. Nie, and X. Xia, “Code structure guided transformer for source code summarization,” ACM Transactions on Software Engineering and Methodology, 2021.
  16. J. Guo, J. Liu, Y. Wan, L. Li, and P. Zhou, “Modeling hierarchical syntax structure with triplet position for source code summarization,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, 2022.
  17. S. Cho, S. Min, J. Kim, M. Lee, H. Lee, and S. Hong, “Transformers meet stochastic block models: Attention with data-adaptive sparsity and cost,” in Annual Conference on Neural Information Processing Systems, NeurIPS, 2022.
  18. V. L. Shiv and C. Quirk, “Novel positional encodings to enable tree-based transformers,” in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems NeurIPS, 2019.
  19. S. Iyer, I. Konstas, A. Cheung, and L. Zettlemoyer, “Summarizing source code using a neural attention model,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL, 2016.
  20. A. Eriguchi, K. Hashimoto, and Y. Tsuruoka, “Tree-to-sequence attentional neural machine translation,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL, 2016.
  21. H. Wang, H. Yin, M. Zhang, and P. Li, “Equivariant and stable positional encoding for more powerful graph neural networks,” in The Tenth International Conference on Learning Representations, ICLR, 2022.
  22. P. He, X. Liu, J. Gao, and W. Chen, “Deberta: decoding-enhanced bert with disentangled attention,” in 9th International Conference on Learning Representations, ICLR, 2021.
  23. X. Hu, G. Li, X. Xia, D. Lo, S. Lu, and Z. Jin, “Summarizing source code with transferred API knowledge,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI, 2018.
  24. Y. Wan, Z. Zhao, M. Yang, G. Xu, H. Ying, J. Wu, and P. S. Yu, “Improving automatic source code summarization via deep reinforcement learning,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE, 2018.
  25. K. Papineni, S. Roukos, T. Ward, and W. Zhu, “Bleu: a method for automatic evaluation of machine translation,” in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics ACL, 2002.
  26. S. Jiang, A. Armaly, and C. McMillan, “Automatically generating commit messages from diffs using neural machine translation,” in Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, ASE, 2017.
  27. J. Zhang, M. Utiyama, E. Sumita, G. Neubig, and S. Nakamura, “Guiding neural machine translation with retrieved translation pieces,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, 2018.
  28. S. Banerjee and A. Lavie, “METEOR: an automatic metric for MT evaluation with improved correlation with human judgments,” in Proceedings of the Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization@ACL, 2005.
  29. C.-Y. Lin, “Rouge: A package for automatic evaluation of summaries,” 2004.
  30. B. Wei, G. Li, X. Xia, Z. Fu, and Z. Jin, “Code generation as a dual task of code summarization,” in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems NeurIPS, 2019.
  31. W. U. Ahmad, S. Chakraborty, B. Ray, and K. Chang, “A transformer-based approach for source code summarization,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, 2020.
  32. U. Alon, S. Brody, O. Levy, and E. Yahav, “code2seq: Generating sequences from structured representations of code,” in 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019, 2019.
  33. Y. Choi, J. Bak, C. Na, and J. Lee, “Learning sequential and structural information for source code summarization,” in Findings of the Association for Computational Linguistics: ACL/IJCNLP, 2021.
  34. V. J. Hellendoorn, C. Sutton, R. Singh, P. Maniatis, and D. Bieber, “Global relational models of source code,” in 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020, 2020.
  35. Y. Wang, W. Wang, S. R. Joty, and S. C. H. Hoi, “Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation,” 2021.
  36. I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” in 7th International Conference on Learning Representations, ICLR, 2019.
  37. H. Peng, G. Li, W. Wang, Y. Zhao, and Z. Jin, “Integrating tree path in transformer for code representation,” in 35th Conference on Neural Information Processing Systems, NeurIPS, 2021.
  38. S. Liu, Y. Chen, X. Xie, J. K. Siow, and Y. Liu, “Retrieval-augmented generation for code summarization via hybrid GNN,” in 9th International Conference on Learning Representations ICLR, 2021.
  39. J. Qiu, J. Tang, H. Ma, Y. Dong, K. Wang, and J. Tang, “DeepInf: Social Influence Prediction with Deep Learning,” in International Conference on Knowledge Discovery and Data Mining, SIGKDD, 2018.
  40. S. Wu, Y. Tang, Y. Zhu, L. Wang, X. Xie, and T. Tan, “Session-based recommendation with graph neural networks,” 2019.
  41. J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in International Conference on Machine Learning, ICML, 2017.
  42. T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in 5th International Conference on Learning Representations, ICLR, 2017.
  43. P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, “Graph attention networks,” in 6th International Conference on Learning Representations, ICLR, 2018.
  44. C. Ying, T. Cai, S. Luo, S. Zheng, G. Ke, D. He, Y. Shen, and T.-Y. Liu, “Do transformers really perform bad for graph representation?” in 35th Conference on Neural Information Processing Systems NeurIPS, 2021.
  45. H. Maron, H. Ben-Hamu, N. Shamir, and Y. Lipman, “Invariant and equivariant graph networks,” in 7th International Conference on Learning Representations, ICLR, 2019.
  46. J. Kim, S. Oh, and S. Hong, “Transformers generalize deepsets and can be extended to graphs & hypergraphs,” in Annual Conference on Neural Information Processing Systems, NeurIPS, 2021.
  47. J. Kim, T. D. Nguyen, S. Min, S. Cho, M. Lee, H. Lee, and S. Hong, “Pure transformers are powerful graph learners,” in Annual Conference on Neural Information Processing Systems, NeurIPS, 2022.
Authors (2)
  1. Saeyoon Oh (4 papers)
  2. Shin Yoo (49 papers)
