
AUTOMATING CYBER THREAT INTELLIGENCE EXTRACTION USING NATURAL LANGUAGE PROCESSING TECHNIQUES

Increasing negligence and the growing complexity of online confrontations have made it clear that organizations must prioritize real-time, actionable, and scalable Cyber Threat Intelligence (CTI) strategies. The classical approach to CTI collection and analysis relies heavily on manual work over raw, unstructured text such as threat reports, blogs, and advisories, and it cannot keep pace with current cybersecurity threats. This study introduces a hybrid Natural Language Processing (NLP) framework that combines state-of-the-art transformer models, namely fine-tuned BERT architectures, with syntactic dependency parsing and domain-specific rule-based post-processing to automate CTI extraction. A dataset of more than 5,000 cybersecurity documents was created with custom labels that allow the system to extract key threat entities such as malware names, CVEs, IP addresses, threat actors, and TTPs. Experimental comparisons show that the proposed system substantially outperforms the BiLSTM-CRF and traditional CRF baselines, achieving an F1-score of 0.90 in entity recognition. Error analysis also showed that the syntactic and rule-based enhancements substantially reduced entity fragmentation and false positives. The paper further investigates how preprocessing, data source quality, and entity linking to external knowledge bases can improve CTI extraction. The findings demonstrate the promise of advanced NLP methods for making threat intelligence processing more accurate, faster, and more scalable in support of proactive cybersecurity defense.
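The abstract does not give the actual post-processing rules, so the following is only a minimal sketch of what a rule-based pass for two of the named entity types (CVEs and IP addresses) might look like. The patterns, the function names `is_valid_ipv4` and `extract_indicators`, and the output layout are all assumptions made for illustration, not the authors' implementation.

```python
import ipaddress
import re

# Hypothetical patterns: the paper's real rules are not stated in the abstract.
# CVE identifiers follow the form CVE-YYYY-NNNN (four or more digits at the end).
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)

# Candidate dotted quads. The lookarounds reject longer dotted number runs
# (e.g. "1.2.3.4.5") while still allowing a sentence-ending period.
IPV4_RE = re.compile(r"(?<!\d)(?<!\d\.)(?:\d{1,3}\.){3}\d{1,3}(?!\.?\d)")


def is_valid_ipv4(candidate: str) -> bool:
    """Return True only if the candidate parses as a real IPv4 address."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        return False


def extract_indicators(text: str) -> dict[str, list[str]]:
    """Collect de-duplicated, sorted CVE IDs and IPv4 addresses from text."""
    cves = sorted({m.group(0).upper() for m in CVE_RE.finditer(text)})
    ips = sorted(
        {m.group(0) for m in IPV4_RE.finditer(text) if is_valid_ipv4(m.group(0))}
    )
    return {"CVE": cves, "IP": ips}


if __name__ == "__main__":
    report = (
        "Actor exploited cve-2021-44228 and CVE-2017-0144 from 192.168.1.10. "
        "Version 999.1.1.1 and build 1.2.3.4.5 are not addresses."
    )
    print(extract_indicators(report))
```

Validating each regex candidate with the standard-library `ipaddress` module is one simple way a rule layer can cut false positives such as version strings. In a hybrid system of the kind the abstract describes, a pass like this would presumably run alongside or after the transformer-based tagger rather than replace it.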

Related Results

The Relationship Between Eating Behavior and the Incidence of Childhood Obesity (Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas)
An Empirical Study on Cyber Crimes Against Women and Children in India
The aim of the study is to understand cyber crimes against women and children in India over the five years from 2017 to 2021. The study is based on secondary data collec...
Cyber operational risk scenarios for insurance companies
Cyber Operational Risk: Cyber risk is routinely cited as one of the most important sources of operational risks facing organisations today, in various publications and ...
Cyber Espionage
Cyberspace gives rise to risks as well as opportunities, and a prominent threat emerging from this domain is cyber espionage. Because no internationally and legally recognized defi...
Threat-Based Security Risk Evaluation in the Cloud
Research Problem: Cyber attacks are targeting the cloud computing systems, where enterprises, governments, and individuals are outsourcing their storage and computational resources f...
METHODS OF EXTRACTING CYBERSECURITY OBJECTS FROM ELECTRONIC SOURCES USING ARTIFICIAL INTELLIGENCE
Background. The rapid development of information technology (IT) has led to new threats and challenges in the field of cybersecurity. Cyber warfare has become a reality a...
The challenges of cybersecurity insurance development: The case of Latvia
Purpose. This paper aims to provide an overview of the current challenges of cybersecurity insurance, focusing on the identification of development constraints and opportunities an...