Use of SNOMED CT in Large Language Models: Scoping Review (Preprint)

BACKGROUND Large language models (LLMs) have substantially advanced natural language processing (NLP) capabilities but often struggle with knowledge-driven tasks in specialized domains such as biomedicine. Integrating biomedical knowledge sources such as SNOMED CT into LLMs may enhance their performance on biomedical tasks. However, the methodologies and effectiveness of incorporating SNOMED CT into LLMs have not been systematically reviewed.

OBJECTIVE This scoping review aims to examine how SNOMED CT is integrated into LLMs, focusing on (1) the types and components of LLMs being integrated with SNOMED CT, (2) which contents of SNOMED CT are being integrated, and (3) whether this integration improves LLM performance on NLP tasks.

METHODS Following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines, we searched ACM Digital Library, ACL Anthology, IEEE Xplore, PubMed, and Embase for relevant studies published from 2018 to 2023. Studies were included if they incorporated SNOMED CT into LLM pipelines for natural language understanding or generation tasks. Data on LLM types, SNOMED CT integration methods, end tasks, and performance metrics were extracted and synthesized.

RESULTS The review included 37 studies. Bidirectional Encoder Representations from Transformers (BERT) and its biomedical variants were the most commonly used LLMs. Three main approaches for integrating SNOMED CT were identified: (1) incorporating SNOMED CT into LLM inputs (28/37, 76%), primarily using concept descriptions to expand training corpora; (2) integrating SNOMED CT into additional fusion modules (5/37, 14%); and (3) using SNOMED CT as an external knowledge retriever during inference (5/37, 14%). The most frequent end task was medical concept normalization (15/37, 41%), followed by entity extraction or typing and classification. Only about half of the studies (19/37, 51%) provided direct comparisons; of these, most (17/19, 89%) reported performance improvements after SNOMED CT integration. The reported gains varied widely across different metrics and tasks, ranging from 0.87% to 131.66%. However, some studies showed either no improvement or a decline in certain performance metrics.

CONCLUSIONS This review demonstrates diverse approaches for integrating SNOMED CT into LLMs, with a focus on using concept descriptions to enhance biomedical language understanding and generation. While the results suggest potential benefits of SNOMED CT integration, the lack of standardized evaluation methods and comprehensive performance reporting hinders definitive conclusions about its effectiveness. Future research should prioritize consistent reporting of performance comparisons and explore more sophisticated methods for incorporating SNOMED CT’s relational structure into LLMs. In addition, the biomedical NLP community should develop standardized evaluation frameworks to better assess the impact of ontology integration on LLM performance.
JMIR Publications Inc.
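
The proportions reported in the Results can be recomputed from the stated counts. The sketch below is only an arithmetic check, not part of the review's method; the dictionary labels and the helper names `percent` and `check` are illustrative assumptions, and rounding half up to a whole percent is assumed because it reproduces the reported figures.

```python
from fractions import Fraction

# Counts as stated in the Results: (numerator, denominator, reported percent).
# The labels are shorthand chosen for this sketch, not terms from the review.
reported = {
    "SNOMED CT in LLM inputs": (28, 37, 76),
    "additional fusion modules": (5, 37, 14),
    "external knowledge retriever": (5, 37, 14),
    "medical concept normalization": (15, 37, 41),
    "direct comparisons provided": (19, 37, 51),
    "improvement among comparisons": (17, 19, 89),
}


def percent(numerator, denominator):
    """Return the share as a whole-number percentage, rounding half up."""
    # Exact rational arithmetic avoids floating-point rounding surprises.
    return int(Fraction(numerator * 100, denominator) + Fraction(1, 2))


def check(table):
    """Map each label to True if the recomputed percent matches the reported one."""
    return {label: percent(n, d) == p for label, (n, d, p) in table.items()}


# Sum of the three integration-approach counts.
approach_total = 28 + 5 + 5

if __name__ == "__main__":
    for label, ok in check(reported).items():
        print(f"{label}: {'matches' if ok else 'differs'}")
    print("approach counts sum to", approach_total)
```

Note that the 89% figure uses 19 as its denominator (the studies with direct comparisons), not 37. The three approach counts sum to 38, one more than the 37 included studies, which is consistent with at least one study being counted under more than one approach, although the abstract does not say so explicitly.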
