Optimizing Away JavaScript Obfuscation (2009.09170v1)

Published 19 Sep 2020 in cs.CR

Abstract: JavaScript is a popular attack vector for releasing malicious payloads on unsuspecting Internet users. Authors of this malicious JavaScript often employ numerous obfuscation techniques in order to prevent the automatic detection by antivirus and hinder manual analysis by professional malware analysts. Consequently, this paper presents SAFE-Deobs, a JavaScript deobfuscation tool that we have built. The aim of SAFE-Deobs is to automatically deobfuscate JavaScript malware such that an analyst can more rapidly determine the malicious script's intent. This is achieved through a number of static analyses, inspired by techniques from compiler theory. We demonstrate the utility of SAFE-Deobs through a case study on real-world JavaScript malware, and show that it is a useful addition to a malware analyst's toolset.

Summary

The paper introduces SAFE-Deobs, a tool that repurposes compiler optimization techniques to deobfuscate JavaScript malware by parsing obfuscated scripts into an abstract syntax tree and simplifying that tree.
It applies a sequence of passes (constant folding, constant propagation, dead-branch removal, function inlining, string decoding, and variable renaming) to simplify and clarify obfuscated code.
Evaluations on over 39,000 samples demonstrate significant reductions in code complexity metrics, enhancing the efficiency and accuracy of malware analysis.

Optimizing Away JavaScript Obfuscation: An Overview

This essay provides a comprehensive analysis of the paper "Optimizing Away JavaScript Obfuscation" (2009.09170), which introduces the SAFE-Deobs tool for deobfuscating JavaScript malware. This tool leverages static analysis techniques to assist malware analysts in understanding obfuscated scripts, by applying methodologies inspired by compiler theory. The paper's contributions lie in the application and adaptation of these techniques, which have traditionally focused on optimization, to the domain of malware analysis.

Introduction to JavaScript Obfuscation

JavaScript's prevalence in modern web applications also makes it a prime target for malicious exploitation. Obfuscation is a common method used by authors of malicious scripts to evade detection and complicate analysis. Deobfuscation tools must therefore not only reverse obfuscation but also improve the readability of the resulting script. SAFE-Deobs addresses this need through static analyses derived from established compiler optimization techniques.

SAFE-Deobs Design and Implementation

The workflow of SAFE-Deobs centers around the transformation of obfuscated JavaScript into an Abstract Syntax Tree (AST), which is processed using a series of deobfuscation phases that closely mirror compiler optimizations. These include constant folding, constant propagation, dead-branch removal, function inlining, string decoding, and variable renaming.

Figure 1: The SAFE-Deobs workflow.
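
The idea of running compiler-style passes over an AST can be sketched as follows. This is a minimal illustration, not SAFE-Deobs's implementation: the dictionary-based AST with ESTree-style node names, the helper names (`lit`, `is_foldable`, `fold_constants`, `remove_dead_branches`, `deobfuscate`), and the choice of only two passes are all assumptions made for the example. It folds only small integers and strings, where Python and JavaScript semantics agree.

```python
import operator

# Hypothetical mini-AST with ESTree-style node names (an assumption for
# illustration; SAFE-Deobs uses its own AST representation).
BINARY_OPS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "===": operator.eq}


def lit(value):
    """Build a literal node."""
    return {"type": "Literal", "value": value}


def is_foldable(op, left, right):
    """Fold only same-typed small integers, or strings under + and ===."""
    if type(left) is not type(right):
        return False
    if type(left) is int:
        return True
    return type(left) is str and op in ("+", "===")


def fold_constants(node):
    """Bottom-up pass: collapse binary expressions over two literals."""
    if isinstance(node, list):
        return [fold_constants(child) for child in node]
    if not isinstance(node, dict):
        return node
    node = {key: fold_constants(value) for key, value in node.items()}
    if node.get("type") == "BinaryExpression" and node["operator"] in BINARY_OPS:
        left, right = node["left"], node["right"]
        if (left["type"] == "Literal" and right["type"] == "Literal"
                and is_foldable(node["operator"], left["value"], right["value"])):
            return lit(BINARY_OPS[node["operator"]](left["value"], right["value"]))
    return node


def remove_dead_branches(node):
    """Replace an `if` whose test is a boolean literal with the live branch."""
    if isinstance(node, list):
        return [remove_dead_branches(child) for child in node]
    if not isinstance(node, dict):
        return node
    node = {key: remove_dead_branches(value) for key, value in node.items()}
    if node.get("type") == "IfStatement":
        test = node["test"]
        if test["type"] == "Literal" and isinstance(test["value"], bool):
            live = node["consequent"] if test["value"] else node.get("alternate")
            return live or {"type": "EmptyStatement"}
    return node


def deobfuscate(ast):
    """Run the passes in order; each pass returns a new tree."""
    for run_pass in (fold_constants, remove_dead_branches):
        ast = run_pass(ast)
    return ast


def call(name):
    """Statement node standing in for a call to `name` (simplified)."""
    return {"type": "ExpressionStatement",
            "expression": {"type": "Identifier", "name": name}}


# Models: if (2 * 3 === 6) payload; else decoy;
program = {"type": "Program", "body": [{
    "type": "IfStatement",
    "test": {"type": "BinaryExpression", "operator": "===",
             "left": {"type": "BinaryExpression", "operator": "*",
                      "left": lit(2), "right": lit(3)},
             "right": lit(6)},
    "consequent": call("payload"),
    "alternate": call("decoy"),
}]}

result = deobfuscate(program)
```

After both passes, `result["body"]` contains only the `payload` statement: folding turns the test into a boolean literal, which is what lets dead-branch removal discard the decoy branch. The ordering matters for the same reason in a real tool, since later passes depend on the simplifications made by earlier ones.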

Deobfuscation Phases

Constant Folding: Identifies constant expressions and collapses them to simpler forms within the AST, using pattern-matching for efficient transformations.
Constant Propagation: Substitutes known constant values within expressions, maintaining a stateful representation to ensure correctness across multiple AST traversals.
Dead-Branch Removal: Removes branches of code that are determined to be never executed, simplifying the control flow for better readability and analysis.
Function Inlining: Expands trivial function calls, particularly calls to functions that perform a simple operation or directly return a literal, thus reducing the complexity of the script.
String Decoding: Targets encoded strings, such as those using hexadecimal or unicode representations, and decodes them to their human-readable form.
Variable Renaming: Renames variables with complex or confusing names to more readable ones, making the script easier for analysts to follow.
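
As an illustration of the string decoding phase, the sketch below rewrites hexadecimal (`\xNN`) and unicode (`\uNNNN`) escapes in the body of a JavaScript string literal. It is a simplified stand-in rather than the tool's own logic: the function name `decode_escapes` is hypothetical, the function works on raw text instead of AST nodes, and it ignores surrogate pairs, escaped backslashes, and the need to re-escape quote characters.

```python
import re

# Matches \xNN (two hex digits) or \uNNNN (four hex digits).
ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})|\\u([0-9A-Fa-f]{4})")


def decode_escapes(literal_body):
    """Replace each hex or unicode escape with the character it encodes."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    return ESCAPE.sub(replace, literal_body)


# Raw strings keep the backslashes, as they would appear in JavaScript source.
print(decode_escapes(r"\x65\x76\x61\x6c"))    # eval
print(decode_escapes(r"\u0064\u006f\u0063"))  # doc
```

Text without escapes passes through unchanged, so the function can be applied to every string literal in a script.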

Evaluation and Implications

SAFE-Deobs' efficacy is demonstrated through a detailed case study and a broader evaluation on a large dataset of obfuscated JavaScript samples. The case study highlights the tool's ability to significantly reduce script complexity and obfuscation, making it straightforward to identify the malicious intent behind the script. Across a dataset of over 39,000 samples, the tool consistently decreased metrics such as lines of code, cyclomatic complexity, and Halstead length, confirming its utility in streamlining the analysis workflow for malware analysts.
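
Two of these metrics have simple textbook definitions: Halstead length is the total number of operator and operand occurrences, and cyclomatic complexity for a single routine is the number of decision points plus one. The sketch below computes both over a hand-tokenized script. The token format, the `DECISION_TOKENS` set, and the function names are assumptions for illustration; the exact counting rules used in the paper's evaluation are not specified here and may differ.

```python
# Tokens are (kind, text) pairs, where kind is "operator" or "operand".
# Which constructs count as decision points varies between tools; this set
# is one common choice, not the paper's definition.
DECISION_TOKENS = {"if", "for", "while", "case", "catch", "&&", "||", "?"}


def halstead_length(tokens):
    """N = N1 + N2: total operator occurrences plus total operand occurrences."""
    operators = sum(1 for kind, _ in tokens if kind == "operator")
    operands = sum(1 for kind, _ in tokens if kind == "operand")
    return operators + operands


def cyclomatic_complexity(tokens):
    """Decision points plus one, for a single routine."""
    decisions = sum(1 for kind, text in tokens
                    if kind == "operator" and text in DECISION_TOKENS)
    return decisions + 1


# Models: if (2 * 3 === 6) { payload(); } else { decoy(); }
before = [
    ("operator", "if"), ("operand", "2"), ("operator", "*"), ("operand", "3"),
    ("operator", "==="), ("operand", "6"), ("operand", "payload"),
    ("operator", "()"), ("operator", "else"), ("operand", "decoy"),
    ("operator", "()"),
]
# Models the simplified script: payload();
after = [("operand", "payload"), ("operator", "()")]

print(halstead_length(before), cyclomatic_complexity(before))  # 11 2
print(halstead_length(after), cyclomatic_complexity(after))    # 2 1
```

Both numbers drop once the constant test and the unreachable branch are gone, which is the kind of reduction the evaluation reports in aggregate.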

Theoretical and Practical Implications

The approach detailed in the paper holds implications for both researchers and practitioners. Theoretically, it exemplifies the adaptability of compiler optimization techniques to contexts beyond performance improvement. Practically, SAFE-Deobs provides an open-source, reproducible tool that integrates into existing analytical toolchains for malware analysis, potentially enhancing the speed and accuracy of manual and automated script assessments.

Conclusion

"Optimizing Away JavaScript Obfuscation" makes a valuable contribution to the field of malware analysis, specifically in dealing with JavaScript obfuscation. By repurposing compiler techniques for deobfuscation, SAFE-Deobs offers a tool that helps analysts dissect and understand obfuscated scripts, adding to the methods available for tackling malicious code. The open-source nature of the tool also encourages integration and further research within the broader security community.