Boosting Static Resource Leak Detection via LLM-based Resource-Oriented Intention Inference (2311.04448v4)

Published 8 Nov 2023 in cs.SE

Abstract: Resource leaks, caused by resources not being released after acquisition, often lead to performance issues and system crashes. Existing static detection techniques rely on mechanical matching of predefined resource acquisition/release APIs and null-checking conditions to find unreleased resources, suffering from both (1) false negatives caused by the incompleteness of predefined resource acquisition/release APIs and (2) false positives caused by the incompleteness of resource reachability validation identification. To overcome these challenges, we propose InferROI, a novel approach that leverages the exceptional code comprehension capability of LLMs to directly infer resource-oriented intentions (acquisition, release, and reachability validation) in code. InferROI first prompts the LLM to infer involved intentions for a given code snippet, and then incorporates a two-stage static analysis approach to check control-flow paths for resource leak detection based on the inferred intentions. We evaluate the effectiveness of InferROI in both resource-oriented intention inference and resource leak detection. Experimental results on the DroidLeaks and JLeaks datasets demonstrate InferROI achieves promising bug detection rate (59.3% and 62.5%) and false alarm rate (18.6% and 19.5%). Compared to three industrial static detectors, InferROI detects 14~45 and 149~485 more bugs in DroidLeaks and JLeaks, respectively. When applied to real-world open-source projects, InferROI identifies 29 unknown resource leak bugs (verified by authors), with 7 of them being confirmed by developers. In addition, the results of an ablation study underscore the importance of combining LLM-based inference with static analysis.

Summary

The paper introduces InferROI, which uses LLMs to infer resource-oriented intentions (acquisition, release, and reachability validation) so that static resource leak detection no longer depends on predefined API lists.
It employs a two-stage detection process that first identifies leak-risk paths and then prunes false positives through reachability validations.
Evaluations on DroidLeaks and JLeaks show bug detection rates of 59.3% and 62.5%, with intention inference reaching 74.6% precision and 81.8% recall; the approach also uncovers 29 previously unknown leaks in open-source projects.

Inferring Resource-Oriented Intentions using LLMs for Static Resource Leak Detection

The paper under review presents InferROI, an approach that uses LLMs for static resource leak detection by inferring resource-oriented intentions directly from code. It pairs this inference with a two-stage static analysis that detects more bugs than existing static detectors while pruning false alarms.

Overview

Resource leaks, which occur when acquired resources are not released, have long been recognized as critical software defects that cause performance degradation and system failures. Traditional static analysis methods for resource leak detection depend on predefined API pairs and mechanical matching. This causes false negatives when the list of acquisition/release APIs is incomplete, and false positives when reachability validation conditions (such as null checks) are only partially recognized.
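The false-negative problem can be illustrated with a minimal sketch of mechanical API-pair matching. This is not the implementation of any real detector: the table `PREDEFINED_PAIRS`, the function `unreleased`, and the API name strings are all hypothetical and chosen only for illustration.

```python
# Hypothetical table of acquisition API -> release API, standing in for the
# predefined pairs a mechanical matcher would ship with.
PREDEFINED_PAIRS = {"FileInputStream.<init>": "FileInputStream.close"}


def unreleased(calls, pairs=PREDEFINED_PAIRS):
    """Return variables acquired through a listed API and never released.

    `calls` is one execution path given as (api_name, variable) tuples.
    """
    pending = {}  # variable -> release API it is still waiting for
    for api, var in calls:
        if api in pairs:
            pending[var] = pairs[api]
        elif pending.get(var) == api:
            del pending[var]
    return sorted(pending)


listed_leak = [("FileInputStream.<init>", "stream")]
listed_ok = listed_leak + [("FileInputStream.close", "stream")]
unlisted_leak = [("WifiLock.acquire", "lock")]  # acquisition API missing from the table

print(unreleased(listed_leak))    # ['stream']
print(unreleased(listed_ok))      # []
print(unreleased(unlisted_leak))  # [] even though the lock is never released
```

The last call returns an empty list only because the acquisition API is absent from the table, which is the kind of miss that inferring intentions from the code itself is meant to avoid.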

InferROI brings a fresh perspective by employing the advanced code comprehension capabilities of LLMs to infer three distinct resource-oriented intentions in code: resource acquisition, release, and reachability validation. This inference does not depend on prior knowledge of resource-specific APIs, making it more adaptable and expansive in detecting diverse resource types.

Methodology

The approach outlined in the paper consists of several key components:

Resource-Oriented Intention Inference: The paper employs prompting strategies tailored for GPT-4, instructing it to discern resource-related intentions from given code snippets. The LLM analyzes the syntax and semantics of the code to infer potential acquisition, release, and validation intentions. The extracted intentions are then formalized into expressions suitable for subsequent analysis.
Lightweight Static Analysis: Once intentions are inferred, InferROI applies a two-stage path analysis to detect resource leaks effectively. The first stage identifies potential leak-risky paths based on the inferred acquisition and release intentions. The second stage prunes these paths by assessing resource reachability validation, thus reducing false positives.
Application and Evaluation: In evaluations using the DroidLeaks and JLeaks datasets, InferROI achieved bug detection rates of 59.3% and 62.5%, with false alarm rates of 18.6% and 19.5%, respectively. Compared with established static analysis tools such as SpotBugs, Infer, and PMD, it detected 14 to 45 more bugs in DroidLeaks and 149 to 485 more in JLeaks. InferROI also identified 29 previously unknown resource leaks in real-world open-source projects, 7 of which were confirmed by developers, underscoring its practical utility.
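The two-stage path analysis described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: control-flow paths are assumed to be already enumerated as lists of events, and the `Event` type, the `reachable` flag, and both function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical event kinds mirroring the three inferred intentions.
ACQ, REL, VAL = "acquire", "release", "validate"


@dataclass(frozen=True)
class Event:
    kind: str               # ACQ, REL, or VAL
    resource: str           # variable that holds the resource
    reachable: bool = True  # for VAL only: does the check say the resource is live on this path?


def leak_risky_paths(paths):
    """Stage 1: collect (path index, resource) pairs acquired but not released on that path."""
    risky = []
    for i, path in enumerate(paths):
        held = []
        for ev in path:
            if ev.kind == ACQ and ev.resource not in held:
                held.append(ev.resource)
            elif ev.kind == REL and ev.resource in held:
                held.remove(ev.resource)
        risky.extend((i, res) for res in held)
    return risky


def prune_unreachable(paths, risky):
    """Stage 2: drop pairs where a validation on the path marks the resource unreachable."""
    kept = []
    for i, res in risky:
        unreachable = any(
            ev.kind == VAL and ev.resource == res and not ev.reachable
            for ev in paths[i]
        )
        if not unreachable:
            kept.append((i, res))
    return kept


paths = [
    # cursor acquired, null check passes, cursor closed
    [Event(ACQ, "cursor"), Event(VAL, "cursor", True), Event(REL, "cursor")],
    # cursor "acquired" but the null check fails, so there is nothing to release
    [Event(ACQ, "cursor"), Event(VAL, "cursor", False)],
    # lock acquired and the path ends without a release
    [Event(ACQ, "lock")],
]

risky = leak_risky_paths(paths)
print(risky)                            # [(1, 'cursor'), (2, 'lock')]
print(prune_unreachable(paths, risky))  # [(2, 'lock')]
```

In this toy run, stage 1 flags two paths, and stage 2 removes the one where the failed null check shows that no resource was actually held, leaving only the unreleased lock.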

Findings and Implications

InferROI marks a clear step forward in resource leak detection and illustrates the potential of LLMs in static analysis. The empirical results, together with a precision of 74.6% and a recall of 81.8% in intention inference, indicate that it can identify diverse resource types across different codebases.

Broader Coverage of Resource Types: By decoupling from the constraints of predefined API pairs, InferROI achieves broader resource type coverage, outperforming traditional static analyzers which often miss less common or newly introduced resource types.
Scalability and Flexibility: Because the LLM infers intentions without extensive manual configuration or predefined API knowledge, the approach can be applied across a wide range of programming environments and use cases.
Complementary to Existing Techniques: While showcasing independent effectiveness, InferROI's approach can complement more rigorous program analysis techniques, opening avenues for hybrid detection strategies that leverage both LLM-based inference and sound static analysis.

Future Directions

The findings from this paper open up several research opportunities. Future work could extend this approach to additional programming languages and integrate more sophisticated program analysis methodologies to address scenarios such as Android’s complex lifecycle management. Additionally, advancements in LLM technology and fine-tuning could further empower the inference capabilities, bridging the gap between syntactic comprehension and deeper semantic understanding.

Conclusion

InferROI embodies a promising advancement in static resource leak detection, offering a robust framework that effectively incorporates LLM-based resource-oriented intention inference. This work emphasizes the important role of AI in enhancing static analysis tools, providing a pathway to more intelligent and adaptive defect detection methodologies.
