Summary |
As the Internet has become an essential part of daily life, a growing number of people enjoy the convenience it brings, while a growing number of attacks come from its dark side. Exploiting weaknesses of human nature, hackers design deceptive phishing pages that entice visitors to voluntarily disclose private and sensitive information. In this study, we propose a URL-based detection system that combines the URL, the page content, and the web page source code as features, uses Levenshtein distance to calculate string similarity, and classifies pages with a machine learning architecture. The system is designed to detect unknown phishing pages with high accuracy and a low false positive rate. |
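
To illustrate the string-similarity step mentioned in the summary, the following is a minimal sketch of Levenshtein distance applied to two hostnames. It is not the system's actual implementation: the function names `levenshtein` and `similarity`, the normalization by the longer string's length, and the example hostnames are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    # Keep the shorter string in the inner loop to reduce memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 means identical, 0.0 means fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    # Illustrative hostnames only: a legitimate name and a look-alike.
    print(levenshtein("paypal.com", "paypa1.com"))
    print(similarity("paypal.com", "paypa1.com"))
```

A score like this could serve as one numeric feature among the URL, content, and source-code features passed to a classifier; a high similarity to a well-known hostname that is not an exact match is a plausible phishing signal.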