Level Up with ML Vulnerability Identification: Leveraging Domain Constraints in Feature Space for Robust Android Malware Detection (2205.15128v4)
Abstract: Machine Learning (ML) promises to enhance the efficacy of Android Malware Detection (AMD); however, ML models are vulnerable to realistic evasion attacks, in which an attacker crafts realizable Adversarial Examples (AEs) that satisfy Android malware domain constraints. To eliminate ML vulnerabilities, defenders aim to identify the vulnerable regions of the feature space where ML models are prone to deception. The primary way to identify these regions is to investigate realizable AEs, but generating such feasible apps is challenging. Previous work on adversarial hardening has relied on either feature-space norm-bounded AEs or problem-space realizable AEs. The former are efficient to generate but do not fully cover the vulnerable regions, while the latter can uncover these regions by satisfying domain constraints but are known to be time-consuming to generate. To address these limitations, we propose an approach that facilitates the identification of vulnerable regions. Specifically, we introduce a new interpretation of Android domain constraints in the feature space, followed by a novel technique that learns them. Our empirical evaluations across various evasion attacks indicate effective detection of AEs using the learned domain constraints, with an average of 89.6%. Furthermore, extensive experiments on different Android malware detectors demonstrate that using our learned domain constraints in Adversarial Training (AT) outperforms other AT-based defenses that rely on norm-bounded AEs or state-of-the-art non-uniform perturbations. Finally, we show that retraining a malware detector with a wide variety of feature-space realizable AEs results in a 77.9% robustness improvement against realizable AEs generated by unknown problem-space transformations, with up to 70x faster training than using problem-space realizable AEs.
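To make the idea of domain constraints in the feature space more concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes binary feature vectors and pictures a constraint as a pairwise implication between features ("if feature i is present, feature j must be too") mined from real apps. The function names, the `min_support` parameter, and the three toy features are all hypothetical and chosen only for illustration.

```python
import numpy as np

def learn_implications(X, min_support=2):
    """Return pairs (i, j) such that x_i = 1 implies x_j = 1 in every row of X.

    Illustrative assumption: a domain constraint is a pairwise implication
    that holds for all observed apps; this is not the paper's technique.
    """
    X = np.asarray(X, dtype=bool)
    n_features = X.shape[1]
    rules = []
    for i in range(n_features):
        rows_with_i = X[X[:, i]]          # apps in which feature i is present
        if len(rows_with_i) < min_support:
            continue                      # too few observations to trust a rule
        for j in range(n_features):
            if i != j and rows_with_i[:, j].all():
                rules.append((i, j))
    return rules

def violations(x, rules):
    """List the learned rules that a candidate feature vector breaks."""
    x = np.asarray(x, dtype=bool)
    return [(i, j) for i, j in rules if x[i] and not x[j]]

# Toy data with hypothetical features:
# column 0 = SEND_SMS permission, 1 = sendTextMessage API call, 2 = INTERNET permission
X_train = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
])

rules = learn_implications(X_train)

# A feature-space perturbation that keeps the API call but drops the permission
# breaks the learned rule, so it can be flagged as not realizable.
suspicious = violations([0, 1, 1], rules)
consistent = violations([1, 1, 0], rules)
```

In this toy setting the only rule recovered is that the API call implies the permission. A vector that violates a learned rule lies outside the region populated by real apps, which is the intuition behind using learned constraints both to detect AEs and to restrict the perturbations considered during adversarial training.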
- Hamid Bostani
- Zhengyu Zhao
- Zhuoran Liu
- Veelasha Moonsamy