Classifiers of Medical Eponymy in Scientific Texts

Many concepts in the medical literature are named after persons. Frequent ambiguities and spelling variants, however, complicate the automatic recognition of such eponyms with natural language processing (NLP) tools. Recently developed methods include word vectors and transformer models that incorporate context information into the downstream layers of a neural network architecture. To evaluate these models for classifying medical eponymy, we labeled eponyms and counterexamples mentioned in a convenience sample of 1,079 PubMed abstracts, and fitted logistic regression models to the vectors from the first (vocabulary) and last (contextualized) layers of a SciBERT language model. Measured by the area under sensitivity-specificity curves, models based on contextualized vectors achieved a median performance of 98.0% on held-out phrases, outperforming models based on vocabulary vectors (95.7%) by a median of 2.3 percentage points. When processing unlabeled inputs, such classifiers appeared to generalize to eponyms that did not appear among any annotations. These findings attest to the effectiveness of developing domain-specific NLP functions based on pre-trained language models, and underline the utility of context information for classifying potential eponyms.
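The evaluation metric in the abstract can be illustrated with a minimal sketch. It assumes the area under the sensitivity-specificity curve is computed in its rank form (the probability that a randomly chosen eponym receives a higher score than a randomly chosen counterexample) and that the reported gain is a median of paired differences across evaluation splits. The abstract does not specify either detail, and all function names, scores, and per-split values below are hypothetical illustrations, not data from the study.

```python
from statistics import median


def roc_auc(labels, scores):
    """Area under the sensitivity-specificity (ROC) curve in rank form.

    Equals the fraction of (positive, negative) pairs in which the
    positive example has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def median_gain_pp(aucs_a, aucs_b):
    """Median of paired AUC differences (a minus b), in percentage points."""
    if len(aucs_a) != len(aucs_b):
        raise ValueError("both models need one AUC per split")
    return 100.0 * median(a - b for a, b in zip(aucs_a, aucs_b))


# Hypothetical toy phrases: 1 = eponym, 0 = counterexample.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical classifier scores from the two kinds of input vectors.
contextual_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
vocabulary_scores = [0.9, 0.5, 0.7, 0.6, 0.2, 0.1]

auc_contextual = roc_auc(labels, contextual_scores)
auc_vocabulary = roc_auc(labels, vocabulary_scores)

# Hypothetical per-split AUCs for the two model families.
gain = median_gain_pp([0.98, 0.97, 0.99], [0.95, 0.96, 0.96])

print(f"contextual AUC: {auc_contextual:.3f}")
print(f"vocabulary AUC: {auc_vocabulary:.3f}")
print(f"median gain: {gain:.1f} percentage points")
```

The pairwise loop is quadratic in the number of phrases, which is acceptable for a small labeled sample; a sort-based rank computation would be the usual choice for larger sets.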

IOS Press

Dennis Toddenroth

Studies in Health Technology and Informatics

2023
