PMANet: Malicious URL detection via post-trained language model guided multi-level feature attention network (2311.12372v2)

Published 21 Nov 2023 in cs.CR

Abstract: The proliferation of malicious URLs has made their detection crucial for enhancing network security. While pre-trained language models offer promise, existing methods struggle with domain-specific adaptability, character-level information, and local-global encoding integration. To address these challenges, we propose PMANet, a pre-trained language model guided multi-level feature attention network. PMANet employs a post-training process with three self-supervised objectives: masked language modeling, noisy language modeling, and domain discrimination, effectively capturing subword and character-level information. It also includes a hierarchical representation module and a dynamic layer-wise attention mechanism for extracting features from low to high levels. Additionally, spatial pyramid pooling integrates local and global features. Experiments on diverse scenarios, including small-scale data, class imbalance, and adversarial attacks, demonstrate PMANet's superiority over state-of-the-art models, achieving a 0.9941 AUC and correctly detecting all 20 malicious URLs in a case study. Code and data are available at https://github.com/Alixyvtte/Malicious-URL-Detection-PMANet.
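
Two of the components named in the abstract, the dynamic layer-wise attention that fuses features from all encoder layers and the spatial pyramid pooling that merges local and global sequence features, can be illustrated with a short PyTorch sketch. The class names, tensor shapes, and pooling levels below are illustrative assumptions rather than the authors' implementation; the paper's actual code is in the linked repository.

```python
# Hypothetical sketch of two modules described in the abstract:
# (1) dynamic layer-wise attention over all encoder-layer hidden states,
# (2) 1-D spatial pyramid pooling to combine local and global features.
# Names, dimensions, and pooling levels are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicLayerAttention(nn.Module):
    """Learns softmax weights to fuse hidden states from every encoder layer."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states):
        # hidden_states: (num_layers, batch, seq_len, hidden)
        weights = F.softmax(self.layer_logits, dim=0)            # (num_layers,)
        return torch.einsum("l,lbth->bth", weights, hidden_states)


class SpatialPyramidPooling1D(nn.Module):
    """Pools the fused sequence at several granularities and concatenates them."""

    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.pools = nn.ModuleList(nn.AdaptiveMaxPool1d(k) for k in levels)

    def forward(self, x):
        # x: (batch, seq_len, hidden) -> (batch, hidden, seq_len) for 1-D pooling
        x = x.transpose(1, 2)
        pooled = [pool(x).flatten(1) for pool in self.pools]     # each (batch, hidden * k)
        return torch.cat(pooled, dim=1)                          # (batch, hidden * sum(levels))


if __name__ == "__main__":
    # Toy shapes: 12 encoder layers, batch of 2 URLs, 32 tokens, hidden size 64.
    hidden = torch.randn(12, 2, 32, 64)
    fused = DynamicLayerAttention(num_layers=12)(hidden)         # (2, 32, 64)
    features = SpatialPyramidPooling1D()(fused)                  # (2, 64 * 7)
    print(features.shape)
```

The pyramid levels (1, 2, 4) give one global pooling plus progressively finer local poolings over the URL token sequence, which is one plausible way to realize the local-global integration the abstract describes.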

Citations (4)

Summary

We haven't generated a summary for this paper yet.