Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 72 tok/s
Gemini 2.5 Pro 57 tok/s Pro
GPT-5 Medium 43 tok/s Pro
GPT-5 High 23 tok/s Pro
GPT-4o 107 tok/s Pro
Kimi K2 219 tok/s Pro
GPT OSS 120B 465 tok/s Pro
Claude Sonnet 4 39 tok/s Pro
2000 character limit reached

An Empirical Study of Automated Vulnerability Localization with Large Language Models (2404.00287v1)

Published 30 Mar 2024 in cs.SE and cs.CR

Abstract: Recently, Automated Vulnerability Localization (AVL) has attracted much attention, aiming to facilitate diagnosis by pinpointing the lines of code responsible for discovered vulnerabilities. LLMs have shown potential in various domains, yet their effectiveness in vulnerability localization remains underexplored. In this work, we perform the first comprehensive study of LLMs for AVL. Our investigation encompasses 10+ leading LLMs suitable for code analysis, including ChatGPT and various open-source models, across three architectural types: encoder-only, encoder-decoder, and decoder-only, with model sizes ranging from 60M to 16B parameters. We explore the efficacy of these LLMs using 4 distinct paradigms: zero-shot learning, one-shot learning, discriminative fine-tuning, and generative fine-tuning. Our evaluation framework is applied to the BigVul-based dataset for C/C++, and an additional dataset comprising smart contract vulnerabilities. The results demonstrate that discriminative fine-tuning of LLMs can significantly outperform existing learning-based methods for AVL, while other paradigms prove less effective or unexpectedly ineffective for the task. We also identify challenges related to input length and unidirectional context in fine-tuning processes for encoders and decoders. We then introduce two remedial strategies: the sliding window and the right-forward embedding, both of which substantially enhance performance. Furthermore, our findings highlight certain generalization capabilities of LLMs across Common Weakness Enumerations (CWEs) and different projects, indicating a promising pathway toward their practical application in vulnerability localization.

Citations (15)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.