
Superion: Grammar-Aware Greybox Fuzzing (1812.01197v3)

Published 4 Dec 2018 in cs.CR and cs.SE

Abstract: In recent years, coverage-based greybox fuzzing has proven itself to be one of the most effective techniques for finding security bugs in practice. Particularly, American Fuzzy Lop (AFL for short) is deemed to be a great success in fuzzing relatively simple test inputs. Unfortunately, when it meets structured test inputs such as XML and JavaScript, those grammar-blind trimming and mutation strategies in AFL hinder the effectiveness and efficiency. To this end, we propose a grammar-aware coverage-based greybox fuzzing approach to fuzz programs that process structured inputs. Given the grammar (which is often publicly available) of test inputs, we introduce a grammar-aware trimming strategy to trim test inputs at the tree level using the abstract syntax trees (ASTs) of parsed test inputs. Further, we introduce two grammar-aware mutation strategies (i.e., enhanced dictionary-based mutation and tree-based mutation). Specifically, tree-based mutation works via replacing subtrees using the ASTs of parsed test inputs. Equipped with grammar-awareness, our approach can carry the fuzzing exploration into width and depth. We implemented our approach as an extension to AFL, named Superion; and evaluated the effectiveness of Superion on real-life large-scale programs (a XML engine libplist and three JavaScript engines WebKit, Jerryscript and ChakraCore). Our results have demonstrated that Superion can improve the code coverage (i.e., 16.7% and 8.8% in line and function coverage) and bug-finding capability (i.e., 31 new bugs, among which we discovered 21 new vulnerabilities with 16 CVEs assigned and 3.2K USD bug bounty rewards received) over AFL and jsfunfuzz. We also demonstrated the effectiveness of our grammar-aware trimming and mutation.

Authors (4)
  1. Junjie Wang (164 papers)
  2. Bihuan Chen (21 papers)
  3. Lei Wei (33 papers)
  4. Yang Liu (2256 papers)
Citations (225)

Summary

  • The paper introduces a grammar-aware fuzzing approach that integrates AST-based trimming and mutation to preserve input structure and improve bug detection.
  • It achieves a 16.7% increase in line coverage and an 8.8% boost in function coverage, uncovering 31 new bugs including 21 previously undetected vulnerabilities.
  • The study highlights the potential for grammar-aware techniques to broaden fuzz testing effectiveness and encourages further research in automatic grammar inference.

Superion: Grammar-Aware Greybox Fuzzing

The paper "Superion: Grammar-Aware Greybox Fuzzing" presents an approach to fuzz testing aimed at programs that process structured inputs such as XML and JavaScript. Coverage-based greybox fuzzers like American Fuzzy Lop (AFL) have been very successful at finding vulnerabilities in programs that take relatively simple test inputs. Their effectiveness drops on structured inputs, however, because their trimming and mutation strategies are grammar-blind: they operate on raw bytes and often produce inputs that the target's parser rejects before any deeper code is reached.

Proposition

The authors address these limitations with a grammar-aware approach to greybox fuzzing, implemented as an extension to AFL named Superion. Given the grammar of the input format, which is often publicly available, Superion parses test inputs into abstract syntax trees (ASTs) and uses these trees to guide both trimming and mutation. The goal is to keep test inputs syntactically valid while extending the breadth and depth of fuzzing exploration.
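To make the tree-level idea concrete, here is a minimal Python sketch of trimming by subtree deletion. It is not Superion's implementation: it uses the standard library's `xml.etree.ElementTree` as the parser, and `keeps_behavior` is a hypothetical oracle standing in for the fuzzer's check that a trimmed input still exercises the same coverage. The point it illustrates is that every candidate is produced by removing a whole subtree, so each one remains well-formed XML.

```python
import xml.etree.ElementTree as ET


def trim_at_tree_level(xml_text, keeps_behavior):
    """Greedily delete subtrees while keeps_behavior(candidate) stays True.

    keeps_behavior is a hypothetical oracle (an assumption of this sketch)
    standing in for a coverage comparison against the untrimmed input.
    """
    root = ET.fromstring(xml_text)
    changed = True
    while changed:
        changed = False
        for parent in list(root.iter()):
            for index, child in enumerate(list(parent)):
                parent.remove(child)  # drop one whole subtree
                candidate = ET.tostring(root, encoding="unicode")
                if keeps_behavior(candidate):
                    changed = True  # keep the deletion and rescan the tree
                    break
                parent.insert(index, child)  # undo: the subtree mattered
            if changed:
                break
    return ET.tostring(root, encoding="unicode")


SEED = ("<plist><dict><key>a</key><string>x</string>"
        "<key>b</key><integer>1</integer></dict></plist>")

# Toy oracle: pretend only inputs that contain an <integer> node reach the
# interesting code in the target.
trimmed = trim_at_tree_level(SEED, lambda text: "<integer>" in text)
print(trimmed)
```

With this toy oracle the seed shrinks to `<plist><dict><integer>1</integer></dict></plist>`. A real fuzzer would replace the lambda with an instrumented run of the target.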

  1. Grammar-aware Trimming: The trimming strategy prunes test inputs at the tree level, removing AST subtrees rather than arbitrary byte ranges, so that trimming respects the input's grammar. AFL's grammar-blind trimming, by contrast, can break the structure of an input, leaving a test input that the parser rejects and that is of little use for further mutation.
  2. Grammar-aware Mutation: Superion introduces two mutation strategies:
    • Enhanced dictionary-based mutation takes grammatically significant tokens from a dictionary and applies them at token boundaries within the input, rather than at arbitrary byte offsets.
    • Tree-based mutations leverage the AST representation, replacing subtrees within the input to create new, potentially interesting test inputs.
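The subtree-replacement idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's algorithm: it uses `xml.etree.ElementTree` in place of a grammar-generated parser, draws replacement subtrees from a second input and from the target itself, and caps the number of mutants with a hypothetical `limit` parameter. The root node is never replaced, since it has no parent position to swap into.

```python
import copy
import xml.etree.ElementTree as ET


def subtree_mutants(target_text, donor_text, limit=50):
    """Return serialized mutants made by swapping one subtree at a time."""
    target = ET.fromstring(target_text)
    donor_root = ET.fromstring(donor_text)
    # Snapshot the pool of candidate subtrees before mutating anything.
    pool = [copy.deepcopy(node)
            for node in list(donor_root.iter()) + list(target.iter())]
    mutants = []
    for parent in list(target.iter()):
        for index in range(len(parent)):
            original = parent[index]
            for donor in pool:
                replacement = copy.deepcopy(donor)
                replacement.tail = original.tail  # keep surrounding text
                parent[index] = replacement
                mutants.append(ET.tostring(target, encoding="unicode"))
                if len(mutants) >= limit:
                    parent[index] = original
                    return mutants
            parent[index] = original  # restore before the next position
    return mutants


mutants = subtree_mutants("<a><b>1</b><c/></a>", "<x><y>2</y></x>")
for text in mutants[:3]:
    print(text)
```

Because each mutant is serialized from a tree, all of them parse again, which is the property byte-level mutation cannot guarantee. The sketch does not deduplicate, so replacing a subtree with a copy of itself reproduces the original input.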

Evaluation

Superion was evaluated on real-world, large-scale programs: the XML engine libplist and three JavaScript engines (WebKit, Jerryscript, and ChakraCore). The reported results are:

  • An improvement in line and function coverage of 16.7% and 8.8%, respectively, over AFL and jsfunfuzz.
  • Stronger bug-finding capability: 31 new bugs, of which 21 are new vulnerabilities, with 16 CVEs assigned.
  • Bug bounty rewards totaling 3.2K USD, which supports Superion's practical value in real-world settings.

Implications and Future Work

The implications of Superion extend beyond providing a more effective tool for fuzzing structured inputs. The work shows the value of building explicit grammar awareness into fuzzing strategies for grammar-bound formats. The same idea could plausibly generalize to other domains where standards or protocols define the input structure.

Future work could integrate automatic grammar inference so that the approach applies where formal grammars are unavailable or proprietary. Other avenues include optimizing grammar-aware parsing and mutation to reduce performance overhead, and exploring adaptive mutation techniques that spend less time on ineffective mutations.

In summary, Superion combines traditional coverage-based greybox fuzzing with grammar awareness to handle structured inputs more effectively, and it provides a foundation for further work on grammar-aware fuzzing.