TY - JOUR
T1 - When a RF beats a CNN and GRU, together—A comparison of deep learning and classical machine learning approaches for encrypted malware traffic classification
AU - Lichy, Adi
AU - Bader, Ofek
AU - Dubin, Ran
AU - Dvir, Amit
AU - Hajaj, Chen
N1 - Publisher Copyright: © 2022 Elsevier Ltd
PY - 2023/1
Y1 - 2023/1
N2 - Internet traffic classification plays a crucial role in Quality of Experience (QoE), Quality of Service (QoS), intrusion detection, and traffic-trend analyses. While there is no theoretical guarantee that deep learning (DL)-based solutions perform better than classic machine learning (ML)-based ones, DL-based models have become the common default. This paper compares well-known DL-based and ML-based models and shows that in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform the classical ML-based ones. We exemplify this finding using two well-known datasets for a varied set of tasks, such as malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset. Note that it is not feasible to evaluate all possible models to make a conclusive statement; thus, the above finding is not a recommendation to avoid DL-based models but rather an empirical finding that, in some cases, simpler solutions may perform even better.
AB - Internet traffic classification plays a crucial role in Quality of Experience (QoE), Quality of Service (QoS), intrusion detection, and traffic-trend analyses. While there is no theoretical guarantee that deep learning (DL)-based solutions perform better than classic machine learning (ML)-based ones, DL-based models have become the common default. This paper compares well-known DL-based and ML-based models and shows that in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform the classical ML-based ones. We exemplify this finding using two well-known datasets for a varied set of tasks, such as malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset. Note that it is not feasible to evaluate all possible models to make a conclusive statement; thus, the above finding is not a recommendation to avoid DL-based models but rather an empirical finding that, in some cases, simpler solutions may perform even better.
KW - Deep learning
KW - Encrypted traffic classification
KW - Machine learning
KW - Malware classification
KW - Malware detection
UR - http://www.scopus.com/inward/record.url?scp=85141515864&partnerID=8YFLogxK
U2 - 10.1016/j.cose.2022.103000
DO - 10.1016/j.cose.2022.103000
M3 - Article
AN - SCOPUS:85141515864
SN - 0167-4048
VL - 124
JO - Computers and Security
JF - Computers and Security
M1 - 103000
ER -