
Ontology Based Text Classifier for Information Extraction from Coronavirus Literature

The world is fighting an unprecedented coronavirus pandemic, and no country was prepared for it. With no cure available, understanding the nature of the disease is vital for accurate clinical diagnosis and for drug discovery. Because the available literature is vast, it is important to represent the disease domain as completely as possible. To extract useful information, a system should capture the morphology, semantics, syntax, and pragmatics of that literature. Building a classifier for a particular domain also suffers from the zero frequency issue. To address this, latent topics are extracted and semantically represented in an ontology, which is used to build a text classifier for coronavirus literature. The classifier has two components: an ontology and a machine learning data model. The ontology models the morphological, semantic, and pragmatic aspects of the text data through Latent Dirichlet Allocation (LDA). It also preserves the contextual information in the document space, providing a holistic feature representation. To solve the zero frequency issue and to extract actionable insights, a machine learning algorithm, the Multi-class Support Vector Machine (M-SVM), is combined with the ontology. It encodes features and yields a classifier with highly discriminated classes. Further, to preserve the contextual information space and to enable data model formulation, the ontology is generated as a knowledge graph with its respective predefined classes. The resulting dataset can be used for clinical diagnosis and further research on the disease. Experimental results show that the proposed classifier outperforms existing systems, with better domain representation.

HIGHLIGHTS
When the amount of literature available is vast, it is important to represent the disease domain as completely as possible.
The system should capture the morphology, semantics, syntax, and pragmatics of the given literature in order to extract useful information.
The classifier has two components: an ontology and a machine learning data model.
The ontology models the morphological, semantic, and pragmatic aspects of the text data through Latent Dirichlet Allocation (LDA). It also preserves the contextual information in the document space, providing a holistic feature representation.
To preserve the contextual information space and to enable data model formulation, the ontology is generated as a knowledge graph with its respective predefined classes.
The resulting dataset can be used for clinical diagnosis and further research on the disease.
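The abstract does not give implementation details, so the following is only a minimal sketch of one idea it describes: storing the ontology as a knowledge graph of terms with predefined classes, and encoding a document by concept rather than by raw term. Under that encoding, a term that never occurred in the training corpus still yields a non-zero feature as long as the ontology knows it, which is one way to read the zero frequency remedy. The triples, class names, and the functions `encode` and `classify` are all hypothetical, and the final argmax merely stands in for a trained classifier such as the M-SVM.

```python
import re
from collections import Counter

# Hypothetical ontology fragment as (subject, predicate, object) triples.
# Terms and class names are illustrative and are not taken from the paper.
TRIPLES = [
    ("fever", "is_a", "Symptom"),
    ("cough", "is_a", "Symptom"),
    ("anosmia", "is_a", "Symptom"),
    ("remdesivir", "is_a", "Treatment"),
    ("dexamethasone", "is_a", "Treatment"),
    ("droplet", "is_a", "Transmission"),
    ("aerosol", "is_a", "Transmission"),
]

# Predefined classes and a term-to-class lookup derived from the graph.
CLASSES = sorted({obj for _, pred, obj in TRIPLES if pred == "is_a"})
TERM_TO_CLASS = {subj: obj for subj, pred, obj in TRIPLES if pred == "is_a"}


def encode(text):
    """Return concept counts for `text`, ordered like CLASSES."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(TERM_TO_CLASS[t] for t in tokens if t in TERM_TO_CLASS)
    return [counts[c] for c in CLASSES]


def classify(text):
    """Pick the class with the highest concept count (stand-in for a trained model)."""
    vector = encode(text)
    if max(vector) == 0:
        return None  # no ontology term matched
    return CLASSES[vector.index(max(vector))]


if __name__ == "__main__":
    doc = "Patients reported anosmia and fever."
    print(CLASSES)        # column order of the feature vector
    print(encode(doc))    # concept counts for the document
    print(classify(doc))  # predicted class
```

In a fuller pipeline, the vectors returned by `encode` would be concatenated with topic proportions from a topic model and passed to a multi-class classifier; that part is omitted here because the source does not specify it.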
College of Graduate Studies, Walailak University

Related Results

Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Abstract The Physical Activity Guidelines for Americans (Guidelines) advises older adults to be as active as possible. Yet, despite the well documented benefits of physical a...
Primerjalna književnost na prelomu tisočletja [Comparative Literature at the Turn of the Millennium]
In a comprehensive and at times critical manner, this volume seeks to shed light on the development of events in Western (i.e., European and North American) comparative literature ...
E-Press and Oppress
From elephants to ABBA fans, silicon to hormone, the following discussion uses a new research method to look at printed text, motion pictures and a te...
Utilizing Large Language Models for Geoscience Literature Information Extraction
Extracting information from unstructured and semi-structured geoscience literature is a crucial step in conducting geological research. The traditional machine learning extraction ...
On Flores Island, do "ape-men" still exist? https://www.sapiens.org/biology/flores-island-ape-men/
Extracting and Merging Contextualized Ontology Modules
Ontology module extraction, from a large ontology, leads to the generation of a specialized knowledge model that is pertinent to specific problems. Existing ontology module extract...
When is R[θ] integrally closed?
Let [Formula: see text] be an integrally closed domain with quotient field [Formula: see text] and [Formula: see text] be an element of an integral domain containing [Formula: see ...
Inductive graph invariants and approximation algorithms
We introduce and study an inductively defined analogue [Formula: see text] of any increasing graph invariant [Formula: see text]. An invariant [Formula: see text] is increasing if ...
