EMBER-Based Static Malware Detection: A Critical Review of Accuracy, Explainability, and Temporal Robustness Trade-offs

Static analysis-based malware detection of Portable Executable (PE) files has advanced considerably since the release of the EMBER dataset in 2018, yet weaknesses in evaluation methodology and model explainability continue to limit real-world deployment. This literature review examines five thematic clusters: Traditional Machine Learning, Deep Learning, Ensemble and Hybrid Architectures, Explainable AI (XAI), and Zero-day Detection with Concept Drift. It analyzes 27 primary studies and 15 supporting references, 42 works in total, published between 2018 and early 2026. The review shows that gradient-boosted decision trees consistently offer better baseline performance, while ensemble and hybrid architectures achieve the highest accuracy overall, at the cost of reduced explainability and increased computational overhead. Deep Learning methods narrow the performance gap but raise transparency and resource concerns, and emerging Large Language Model (LLM)-based approaches remain premature and unverified. Across the five clusters, six intersecting gaps are identified, the most notable being the near-universal reliance on random rather than temporal train/test splits. Other gaps include insufficient reporting of false positive rates at operational thresholds and the consistent separation of explainability from detection performance. Critically, no reviewed study integrates ensemble-level accuracy, embedded explainability, and temporally oriented evaluation within a single framework. The review presents seven future research directions to address these gaps and identifies this unified framework as the most critical priority for research in the field.
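Two of the evaluation gaps named above, random rather than temporal splits and missing false positive rates at operational thresholds, can be made concrete with a small sketch. The code below is illustrative only and is not taken from any reviewed study: the `first_seen` field, the function names `temporal_split` and `tpr_at_fpr`, and the toy data are all assumptions chosen for the example.

```python
from datetime import date


def temporal_split(samples, cutoff):
    """Train on samples first seen before `cutoff`, test on the rest.

    `samples` is a list of dicts with a hypothetical `first_seen` date field.
    Unlike a random split, no test sample predates the training data.
    """
    train = [s for s in samples if s["first_seen"] < cutoff]
    test = [s for s in samples if s["first_seen"] >= cutoff]
    return train, test


def tpr_at_fpr(scores, labels, max_fpr):
    """Best detection rate (TPR) among thresholds whose FPR is <= max_fpr.

    `scores`: higher means more likely malicious.
    `labels`: 1 for malware, 0 for benign.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malware and one benign sample")
    best = 0.0
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best


# Toy data (assumed values, for illustration only).
samples = [
    {"id": "a", "first_seen": date(2018, 3, 1)},
    {"id": "b", "first_seen": date(2018, 9, 1)},
    {"id": "c", "first_seen": date(2019, 2, 1)},
]
train, test = temporal_split(samples, cutoff=date(2019, 1, 1))

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
strict = tpr_at_fpr(scores, labels, max_fpr=0.0)
relaxed = tpr_at_fpr(scores, labels, max_fpr=0.25)
```

Reporting the detection rate at a fixed, low false positive rate, rather than accuracy alone, reflects the operating point a deployed detector would actually use; the thresholds here are arbitrary examples.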

Al-Farabi University College

Ahmed M. Redha Abdulsattar, Riyadh Rahef Nuiaa Alogaili, Ahmed Raad Al-Sudani, Selvakumar Manickam

Journal of Al-Farabi for Engineering Sciences

2026
