
HDI Corpus: A Dataset for Named Entity Recognition for In-Context Herb-Drug Interactions

Introduction: This article proposes a new dataset for Named Entity Recognition (NER), built from PubMed articles, that addresses the problem of Herb-Drug Interactions. The dataset targets herb-drug interaction entities together with the contextual information surrounding them.

Background: Machine learning and deep learning provide powerful tools for task automation, but they require large quantities of data to perform well. In Natural Language Processing, training deep learning models requires annotating large text corpora. Although some corpora exist for the medical literature, each specific task requires a corpus adapted to it.

Methods: The dataset was tested using a classical Named Entity Recognition pipeline, as well as new possibilities offered by generative AI.

Results: The dataset provides annotated sentences from around a hundred articles and covers 15 entity types, including herbs, drugs, and pathologies, as well as contextual information such as cohort composition, patient information, and pharmacological clues.

Discussion: The study shows that, for drug recognition, results obtained with this dataset are comparable to those obtained with the DDI (Drug-Drug Interaction) corpus, a standard dataset for drug Named Entity Recognition, and that performance is good on most of the entities.

Conclusion: We believe this corpus could help diversify pharmacological Named Entity Recognition.
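To make the annotation task concrete, the following is a minimal sketch of how entity spans are commonly turned into per-token BIO tags for a classical NER pipeline. It is an illustration only, not the authors' format: the sentence is invented rather than taken from the corpus, the labels HERB and DRUG are assumed names for two of the 15 entity types, and `spans_to_bio` is a hypothetical helper.

```python
from typing import List, Tuple

# A span is (start_token, end_token_exclusive, label).
# The labels used below (HERB, DRUG) are assumed names, not the corpus's tag set.
Span = Tuple[int, int, str]


def spans_to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Convert token-level entity spans into BIO tags (hypothetical helper)."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags


# Invented example sentence, for illustration only.
tokens = ["St", "John's", "wort", "reduced", "plasma",
          "levels", "of", "cyclosporine", "."]
spans = [(0, 3, "HERB"), (7, 8, "DRUG")]

tags = spans_to_bio(tokens, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Contextual entities such as cohort composition or patient information would be encoded the same way, each with its own label.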

