A Hybrid Phishing Detection System Using Deep Learning-based URL and Content Analysis

Mehmet Korkmaz; Emre Kocyigit; Ozgur Koray Sahingoz; Banu Diri

doi:10.5755/j02.eie.31197

Authors

Mehmet Korkmaz, Department of Computer Engineering, Yildiz Technical University, Turkey
Emre Kocyigit, Department of Computer Engineering, Yildiz Technical University, Turkey
Ozgur Koray Sahingoz, Department of Computer Engineering, Biruni University, Turkey
Banu Diri, Department of Computer Engineering, Yildiz Technical University, Turkey

DOI:

https://doi.org/10.5755/j02.eie.31197

Keywords:

Phishing detection, Deep learning, URL-based, Content-based, Two-stage hybrid system, High-risk dataset

Abstract

Phishing is among the attack types most favored by cybercriminals, who can easily reach a large number of victims through social networks and, in particular, through email messages. To protect end users, most security mechanisms check Uniform Resource Locator (URL) addresses, because this approach is simple to implement and fast to execute. However, sophisticated attackers can evade URL-only checks, so this mechanism can miss some phishing attacks and has a relatively high false positive rate. In this research, a hybrid technique is proposed that uses not only URL features but also content-based features as a second level of detection, improving the accuracy of the detection system while also minimizing the number of false positives. Additionally, most phishing detection algorithms are evaluated on datasets whose samples, phishing or legitimate, are easy to tell apart. To build a more secure protection mechanism, we therefore aimed to collect a larger, high-risk dataset. The proposed approaches were tested on this High-Risk URL and Content-Based Phishing Detection Dataset, which contains only suspicious websites from PhishTank. In experimental studies, an accuracy of 98.37 percent was achieved on this more realistic dataset for phishing detection.
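
The two-level idea in the abstract (a URL check first, content analysis second) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the escalation rule (send only uncertain URL scores to the content stage), the thresholds `low`, `high`, and `content_threshold`, and the toy scorers `toy_url_score` and `toy_content_score`, which stand in for the deep learning models the authors actually train.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    label: str    # "phishing" or "legitimate"
    stage: str    # which stage decided: "url" or "content"
    score: float  # phishing score from the deciding stage


def two_stage_classify(
    url: str,
    fetch_content: Callable[[str], str],
    url_score: Callable[[str], float],
    content_score: Callable[[str], float],
    low: float = 0.2,                 # assumed threshold, not from the paper
    high: float = 0.8,                # assumed threshold, not from the paper
    content_threshold: float = 0.5,   # assumed threshold, not from the paper
) -> Verdict:
    """Decide from the URL alone when its score is confident,
    otherwise fetch the page and let the content scorer decide."""
    p_url = url_score(url)
    if p_url >= high:
        return Verdict("phishing", "url", p_url)
    if p_url <= low:
        return Verdict("legitimate", "url", p_url)
    # Uncertain URL score: escalate to the slower content-based stage.
    p_content = content_score(fetch_content(url))
    label = "phishing" if p_content >= content_threshold else "legitimate"
    return Verdict(label, "content", p_content)


def toy_url_score(url: str) -> float:
    """Hypothetical hand-written heuristic standing in for a trained URL model."""
    score = 0.0
    if "@" in url:
        score += 0.5
    if url.count("-") >= 3:
        score += 0.2
    if not url.startswith("https://"):
        score += 0.3
    return min(score, 1.0)


def toy_content_score(html: str) -> float:
    """Hypothetical stand-in for a content model: flags pages with a password field."""
    return 1.0 if 'type="password"' in html.lower() else 0.0
```

In this sketch the content stage runs only for URLs the first stage cannot decide confidently, which keeps the fast URL check on the common path. Whether the paper combines its two stages this way is not stated in the abstract.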

Published

2022-10-26

Issue

Vol. 28 No. 5 (2022)

Section

TELECOMMUNICATIONS ENGINEERING

License

The copyright for the paper in this journal is retained by the author(s) with the first publication right granted to the journal. The authors agree to the Creative Commons Attribution 4.0 (CC BY 4.0) agreement under which the paper in the Journal is licensed.

By virtue of their appearance in this open access journal, papers are free to use with proper attribution in educational and other non-commercial settings with an acknowledgement of the initial publication in the journal.

How to Cite

Korkmaz, M., Kocyigit, E., Sahingoz, O. K., & Diri, B. (2022). A Hybrid Phishing Detection System Using Deep Learning-based URL and Content Analysis. Elektronika ir Elektrotechnika, 28(5), 80-89. https://doi.org/10.5755/j02.eie.31197