TrafficGPT: Breaking the Token Barrier for Efficient Long Traffic Analysis and Generation (2403.05822v2)

Published 9 Mar 2024 in cs.LG

Abstract: Over the years, network traffic analysis and generation have advanced significantly, progressing from traditional statistical methods to sophisticated deep learning techniques. This progress has improved the ability to detect complex patterns and security threats, as well as to test and optimize network performance. However, obstacles persist, such as the dependence on labeled data for analysis and the difficulty of generating traffic samples that follow realistic patterns. Pre-trained deep neural networks have emerged as powerful tools to resolve these issues, offering improved performance by learning robust data representations from large unlabeled datasets. Despite their benefits, existing pre-trained models face challenges such as limited token length, which restricts their usefulness for comprehensive traffic analysis and realistic traffic generation. To address these challenges, we introduce TrafficGPT, a deep learning model for long traffic flow classification and generation tasks. The model uses generative pre-training with a linear attention mechanism, which raises the supported token capacity from the previous limit of 512 tokens to 12,032 tokens. TrafficGPT reaches state-of-the-art performance on classification tasks. In generation tasks, its output closely resembles real traffic flows, with low JS divergence and an F1 score close to 0.5 (equivalent to random guessing) when discriminating generated data from real data. These advances hold promise for future applications in both traffic flow classification and generation.
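The key mechanism behind the extended token capacity is linear attention, which replaces quadratic softmax attention with a kernel feature map so that compute and memory grow linearly with sequence length. Below is a minimal, non-causal PyTorch sketch of that idea, assuming the common elu(x)+1 feature map from kernelized-attention work; it is illustrative only and not TrafficGPT's implementation (the paper's generative, causal setting additionally requires prefix sums over keys and values).

```python
import torch

def linear_attention(q, k, v, eps=1e-6):
    """Non-causal kernelized (linear) attention sketch, not the paper's code.

    Instead of softmax(Q K^T) V, which costs O(n^2) in sequence length n,
    apply a positive feature map phi and use associativity:
    (phi(Q) phi(K)^T) V == phi(Q) (phi(K)^T V), giving O(n) cost.

    q, k, v: tensors of shape (batch, seq_len, dim).
    """
    phi_q = torch.nn.functional.elu(q) + 1  # positive feature map phi(x) = elu(x) + 1
    phi_k = torch.nn.functional.elu(k) + 1

    # Aggregate keys and values once: O(n * d^2) rather than O(n^2 * d).
    kv = torch.einsum("bnd,bne->bde", phi_k, v)   # (batch, dim, dim)
    k_sum = phi_k.sum(dim=1)                      # (batch, dim)

    numerator = torch.einsum("bnd,bde->bne", phi_q, kv)
    denominator = torch.einsum("bnd,bd->bn", phi_q, k_sum).unsqueeze(-1)
    return numerator / (denominator + eps)


# Toy usage: a 12,032-token flow fits comfortably because memory grows
# linearly with sequence length rather than quadratically.
x = torch.randn(1, 12032, 64)
out = linear_attention(x, x, x)
print(out.shape)  # torch.Size([1, 12032, 64])
```

With standard softmax attention, the 12,032 x 12,032 attention matrix alone would dominate memory at this length; the kernel-trick formulation above avoids materializing it, which is what makes the jump from 512 to 12,032 tokens practical.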
