TY - JOUR
T1 - Content Disarm and Reconstruction of PDF Files
AU - Dubin, Ran
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2023
Y1 - 2023
AB - Content Disarm and Reconstruction (CDR) is a zero-trust file methodology that proactively removes potential attack vectors from documents and media files. While extensive literature on CDR emphasizes its importance, it does not present a detailed discussion of how the CDR process works, its effectiveness, or its drawbacks. Therefore, this paper presents PdfCDR, the first PDF CDR system for which the validation, the prevention rate, and the resulting visual similarity after disarming and reconstruction are presented and measured. Furthermore, PdfCDR proposes, for the first time, a novel method for dealing with newly emerging exploits by automatically converting detection rules into disarm and reconstruction rules. As a result, PdfCDR can prevent evasive attacks without any software upgrades and can use the cyber security community's knowledge to prevent cyber attacks as soon as they are publicized. An evaluation of PdfCDR against well-known PDF datasets shows that it not only disarms the malicious components but also produces a reconstructed file that is usable and functional. However, since CDR relies on understanding the file format, any CDR solution must handle each supported file type separately because file formats differ so widely. Hence, this paper focuses on the Portable Document Format (PDF), a file type that attackers commonly exploit. The results indicate that PdfCDR successfully disarmed and reconstructed 90% of the malicious files, while the remaining 10% were encrypted or had structures that deviated from the standard and were quarantined.
KW - Adobe PDF
KW - CDR
KW - attack prevention
KW - malware
KW - sensitization
KW - threat disarm
KW - zero-trust
UR - http://www.scopus.com/inward/record.url?scp=85153512663&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3267717
DO - 10.1109/ACCESS.2023.3267717
M3 - Article
AN - SCOPUS:85153512663
SN - 2169-3536
VL - 11
SP - 38399
EP - 38416
JO - IEEE Access
JF - IEEE Access
ER -