Explainable Ensemble Learning Based Detection of Evasive Malicious PDF Documents
PDF has become a major attack vector for delivering malware and compromising systems and networks because of its popularity and widespread use across platforms. Its flexible file structure allows many types of content to be embedded, including JavaScript, encoded streams, images, and executable files. This lets attackers embed malicious code and hide its functionality within seemingly benign, non-executable documents. As a result, a large proportion of current automated detection systems cannot effectively detect PDF files with concealed malicious content. To mitigate this problem, this paper proposes a novel approach based on ensemble learning with enhanced static features, which is used to build an explainable and robust malicious PDF document detection system. The proposed system is more resilient against reverse mimicry injection attacks than existing state-of-the-art learning-based malicious PDF detection systems. The recently released EvasivePDFMal2022 dataset was used to investigate the efficacy of the proposed system. On this dataset, an overall classification accuracy greater than 98% was observed with five ensemble learning classifiers. Furthermore, the proposed system, which employs new anomaly-based features, was evaluated on a reverse mimicry attack dataset containing three types of content injection attacks: embedded JavaScript, embedded malicious PDF, and embedded malicious EXE. On the reverse mimicry dataset, the Random Committee ensemble learning model achieved 100% detection rates for embedded EXE and embedded JavaScript and a 98% detection rate for embedded PDF, based on the enhanced feature set.
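To make the idea of static features concrete, the following is a minimal sketch of keyword-count feature extraction from raw PDF bytes. It is not the paper's feature set, which the abstract does not list: the keyword list, the `extract_features` helper, and the `bytes_after_eof` statistic are all assumptions chosen for illustration. The keywords themselves are standard PDF name keys that relate to the content types the abstract mentions (JavaScript, embedded files, automatic actions).

```python
import re

# Hypothetical feature set (an assumption, not taken from the paper):
# PDF name keys commonly tied to active or embedded content.
KEYWORDS = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
            b"/EmbeddedFile", b"/Launch")


def extract_features(pdf_bytes: bytes) -> dict:
    """Return simple static features computed without parsing or executing the PDF."""
    features = {}
    for kw in KEYWORDS:
        # Negative lookahead so that, e.g., "/AA" does not match a longer name.
        pattern = re.escape(kw) + rb"(?![A-Za-z0-9])"
        features[kw.decode("ascii")] = len(re.findall(pattern, pdf_bytes))

    # Structural counts: the word boundary excludes "endobj" and "endstream".
    features["obj_count"] = len(re.findall(rb"\bobj\b", pdf_bytes))
    features["stream_count"] = len(re.findall(rb"\bstream\b", pdf_bytes))

    # Illustrative anomaly-style statistic (hypothetical): number of bytes that
    # follow the last %%EOF marker, or -1 if the marker is missing.
    eof = pdf_bytes.rfind(b"%%EOF")
    features["bytes_after_eof"] = -1 if eof < 0 else len(pdf_bytes) - (eof + 5)
    return features


if __name__ == "__main__":
    sample = (b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
              b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n%%EOF\n")
    for name, value in extract_features(sample).items():
        print(f"{name}: {value}")
```

Feature dictionaries of this kind could be converted to numeric vectors and passed to any ensemble classifier. Raw keyword counts are easy to evade through name obfuscation or object streams, which is one reason a detector would need richer features than this sketch provides.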
Related Results
Design of Malicious Code Detection System Based on Binary Code Slicing
Malicious code threatens the safety of computer systems. Researching malicious code design techniques and mastering code behavior patterns are the basic work of network se...
Construction of a Cybersecurity Behavior Knowledge Base for Malicious Behavior Analysis
Facing the surge in malicious behaviors in the network environment, the existing cybersecurity knowledge graph suffers from fragmented security knowledge and limited application sc...
Localisation of Attacks, Combating Browser-Based Geo-Information and IP Tracking Attacks
Accessing and retrieving users’ browser and network information is a common practice used by advertisers and many online services to deliver targeted ads and explicit impr...
O cuidado e suas dimensões: uma revisão bibliográfica
Introduction: the central issue of this article is care. Care not as the sole expression of technicism, but in its multiple dimensions. Objective: to discuss care from a...
Learning-Based Detection for Malicious Android Application Using Code Vectorization
The malicious APK (Android Application Package) makers use some techniques such as code obfuscation and code encryption to avoid existing detection methods, which poses new challen...
January 2024, Volume 22, Issue 1 Full pdf of issue Editorial Plea for Peace - Publisher World Family Medicine Health-Related Quality of Life (HRQoL) in Haemodialysis Patients in Khartoum, Sudan [Abstract] [pdf] Samira Khatir Ali Fadlalla DOI: 10.5742/
While the bombing of Gaza and the resulting loss of civilians continues, I urge the international community to stop the war now, protect civilians (including health-care workers), ...
Status and solutions of malicious complaints
In Korea, malicious complaints that go beyond common sense are continuously occurring. Considering that Korea is a leading country in terms of security, the serious level of malici...
Defeating Evasive Malware with Peekaboo: Extracting Authentic Malware Behavior with Dynamic Binary Instrumentation
The accuracy of Artificial Intelligence (AI) in malware detection is dependent on the features it is trained with, where the quality and authenticity of these feat...

