
ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection (2402.18093v2)

Published 28 Feb 2024 in cs.CR

Abstract: The proliferation of phishing sites and emails poses significant challenges to existing cybersecurity efforts. Despite advances in malicious email filters and email security protocols, problems with oversight and false positives persist. Users often struggle to understand why emails are flagged as potentially fraudulent, risking the possibility of missing important communications or mistakenly trusting deceptive phishing emails. This study introduces ChatSpamDetector, a system that uses LLMs to detect phishing emails. By converting email data into a prompt suitable for LLM analysis, the system provides a highly accurate determination of whether an email is phishing or not. Importantly, it offers detailed reasoning for its phishing determinations, assisting users in making informed decisions about how to handle suspicious emails. We conducted an evaluation using a comprehensive phishing email dataset and compared our system to several LLMs and baseline systems. We confirmed that our system using GPT-4 has superior detection capabilities with an accuracy of 99.70%. Advanced contextual interpretation by LLMs enables the identification of various phishing tactics and impersonations, making them a potentially powerful tool in the fight against email-based phishing threats.
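The abstract describes converting email data into a prompt suitable for LLM analysis but does not give the template itself. The following is a minimal sketch of that idea using only Python's standard `email` module. The selected header fields, the prompt wording, the JSON answer format, and the names `HEADER_FIELDS`, `body_text`, and `build_prompt` are all illustrative assumptions, not the authors' implementation.

```python
from email import message_from_string
from email.message import Message

# Hypothetical choice of headers to expose to the model (an assumption,
# not the field list used in the paper).
HEADER_FIELDS = ("From", "Reply-To", "Return-Path", "Subject", "Date")


def body_text(msg: Message) -> str:
    """Return the first text/plain body found in the message, or ''."""
    parts = msg.walk() if msg.is_multipart() else [msg]
    for part in parts:
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, or None
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        return payload.decode(charset, errors="replace")
    return ""


def build_prompt(raw_email: str, max_body_chars: int = 2000) -> str:
    """Turn a raw RFC 5322 email string into a text prompt for an LLM."""
    msg = message_from_string(raw_email)
    headers = "\n".join(
        f"{name}: {msg.get(name, '(missing)')}" for name in HEADER_FIELDS
    )
    # Truncate long bodies so the prompt stays within the model's context.
    body = body_text(msg).strip()[:max_body_chars]
    # Illustrative instruction text; the real system's wording may differ.
    return (
        "You are an email security analyst. Decide whether the email below "
        "is phishing and explain your reasoning.\n"
        'Answer as JSON with keys "is_phishing" (true/false) and "reason".\n\n'
        f"--- HEADERS ---\n{headers}\n\n--- BODY ---\n{body}\n"
    )


if __name__ == "__main__":
    sample = (
        "From: support@example.com\n"
        "Subject: Verify your account\n"
        "\n"
        "Click http://example.test/login now.\n"
    )
    print(build_prompt(sample))
```

The resulting string would then be sent to whichever LLM is being evaluated. Asking for both a verdict and a reason mirrors the abstract's emphasis on giving users an explanation rather than only a label.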
