Repository-Level Prompt Generation for Large Language Models of Code (2206.12839v3)
Abstract: With the success of large language models (LLMs) of code and their use as code assistants (e.g., Codex used in GitHub Copilot), techniques for introducing domain-specific knowledge into the prompt design process become important. In this work, we propose a framework called the Repo-Level Prompt Generator that learns to generate example-specific prompts using prompt proposals. The prompt proposals take context from the entire repository, thereby incorporating both the structure of the repository and the context from other relevant files (e.g., imports, parent class files). Our technique does not require any access to the weights of the LLM, making it applicable in cases where we have only black-box access to the LLM. We conduct experiments on the task of single-line code autocompletion using code repositories taken from the Google Code archives. We demonstrate that an oracle constructed from our prompt proposals gives a remarkably high relative improvement of 36% over Codex, showing the quality of these proposals. Further, we show that when we train a model to predict a prompt proposal, we can achieve significant performance gains over Codex and other baselines. We release our code, data, and trained checkpoints at: \url{https://github.com/shrivastavadisha/repo_level_prompt_generation}.
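To make the idea of a prompt proposal concrete, the following Python sketch illustrates (under our own assumptions, not the paper's released implementation) how one proposal might assemble context from a related repository file, here the file of an imported module, and prepend it to the default in-file context before fitting the result into the LLM's context window. The function name, parameters, and truncation policy are hypothetical and for illustration only.

```python
# Hypothetical sketch of a repository-level prompt proposal: prepend context
# drawn from an imported module's file to the default in-file context
# (the lines preceding the completion point). Names and the character-budget
# truncation policy are illustrative assumptions, not the paper's exact code.
from pathlib import Path


def build_prompt(target_file: str, cursor_line: int, imported_file: str,
                 max_chars: int = 6000, proposal_fraction: float = 0.5) -> str:
    """Compose a prompt from a prompt-proposal context plus the default context."""
    # Default context: lines of the target file that precede the line to complete.
    target_lines = Path(target_file).read_text().splitlines()
    default_context = "\n".join(target_lines[:cursor_line])

    # Prompt-proposal context: here, the contents of a file imported by the target file.
    proposal_context = Path(imported_file).read_text()

    # Split the character budget between the proposal context and the default context.
    proposal_budget = int(max_chars * proposal_fraction)
    default_budget = max_chars - proposal_budget

    # Keep the tail of each context, which is closest to the completion point.
    return (proposal_context[-proposal_budget:] + "\n\n" +
            default_context[-default_budget:])


# Example usage (paths and line number are placeholders):
# prompt = build_prompt("repo/app/main.py", cursor_line=42,
#                       imported_file="repo/app/utils.py")
```

In this sketch, varying which repository file supplies the proposal context (imports, parent class files, sibling files, and so on) and how much of the budget it receives yields the family of prompt proposals among which a learned model would select for each example.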