Scopus Indexed Publications

Paper Details


Title
Robust Phishing URL Classification Using FastText Character Embeddings and Hybrid Deep Learning

Author
Shafiur Rahman

Email

Abstract

Phishing attacks are a major cybersecurity threat, with over 1.2 million incidents recorded in the first half of 2020. These attacks caused substantial financial losses and posed risks to individuals and organizations, so reliably identifying fraudulent websites is crucial to mitigating these risks. This study introduces a novel method for detecting phishing URLs that uses word and character embeddings to capture complex URL patterns. We used a dataset of 80,000 URLs, comprising 50,000 legitimate and 30,000 phishing instances, and applied thorough preprocessing. We used FastText word embeddings, whose n-gram representations allow unseen words to be handled, and captured character-level features through dense character embeddings. We trained several machine learning and deep learning models; the Convolutional Bidirectional LSTM (CBiLSTM) performed best, with an accuracy of 99.01% and an F1-score of 99.08%. A thorough comparison with state-of-the-art techniques showed that our approach outperforms previous research. This study presents an effective approach for classifying phishing URLs, providing a valuable tool to combat fraud and protect against identity theft, thereby helping to minimize the financial and emotional harm experienced by victims.
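
The abstract notes that FastText handles unseen words through n-gram representations. The following is a minimal sketch of that idea, not the authors' implementation: a URL is split into tokens, and each token is broken into character n-grams that are hashed into buckets, so a never-before-seen token such as "paypa1" still shares most of its n-grams with "paypal". The function names, the tokenization rule, the 3 to 6 n-gram range, the bucket count, and the use of CRC32 as the hash are all assumptions made for illustration.

```python
import re
import zlib


def tokenize_url(url):
    """Split a URL into alphanumeric tokens (hypothetical preprocessing rule)."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", url) if t]


def char_ngrams(token, n_min=3, n_max=6):
    """Return character n-grams of a token wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{token}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def bucket_ids(token, num_buckets=2_000_000):
    """Map each n-gram to a bucket index; CRC32 is a stand-in hash (assumption)."""
    return [zlib.crc32(g.encode("utf-8")) % num_buckets for g in char_ngrams(token)]


if __name__ == "__main__":
    tokens = tokenize_url("http://paypa1-login.example.com/verify")
    print(tokens)
    # N-grams shared between a known word and an unseen look-alike token
    shared = set(char_ngrams("paypal", 3, 3)) & set(char_ngrams("paypa1", 3, 3))
    print(sorted(shared))
```

In a subword model of this kind, the vector for a token is typically built from the vectors stored at these bucket indices, which is why out-of-vocabulary tokens still receive a meaningful representation.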


Keywords

Journal or Conference Name
2024 IEEE 3rd International Conference on Robotics, Automation, Artificial-Intelligence and Internet-of-Things, RAAICON 2024 - Proceedings

Publication Year
2024

Indexing
Scopus