
Diagnostic Performance of Claude 3 from Patient History and Key Images in Diagnosis Please Cases

Abstract

Background
Large language artificial intelligence models have demonstrated diagnostic performance based solely on textual information from clinical history and imaging findings. However, the extent of their performance when utilizing radiological images and providing differential diagnoses has yet to be investigated.

Purpose
We employed Claude 3 Opus, the latest version of Claude 3, released on March 4, 2024, to investigate its diagnostic performance in answering Radiology’s Diagnosis Please quiz questions under three conditions: (1) when provided with clinical history alone; (2) when given clinical history along with imaging findings; and (3) when supplied with clinical history and key images. Furthermore, we evaluated the diagnostic performance of the model when instructed to list differential diagnoses.

Materials and Methods
Claude 3 Opus was tasked with listing the primary diagnosis and two differential diagnoses for 322 quiz questions from Radiology’s “Diagnosis Please” cases, which included cases 1 to 322, published from 1998 to 2023. The analyses were carried out under the following input conditions:
Condition 1: Submitter-provided clinical history (text) alone
Condition 2: Submitter-provided clinical history and imaging findings (text)
Condition 3: Submitter-provided clinical history (text) and key images (PDF files)
We applied McNemar’s tests to evaluate differences in correct response rates for primary diagnoses across Conditions 1, 2, and 3.

Results
The correct primary diagnosis rates were 62/322 (19.3%), 178/322 (55.3%), and 93/322 (28.8%) for Conditions 1, 2, and 3, respectively. Additionally, Claude 3 Opus provided the correct answer as a differential diagnosis in up to 22/322 (6.8%) of cases. There were statistically significant differences in correct response rates for primary diagnoses between all combinations of Conditions 1, 2, and 3 (p<0.001).

Conclusion
Claude 3 Opus showed significantly better diagnostic performance when key images were provided in addition to clinical history. Its ability to list important differential diagnoses was also confirmed.

Key Results
This study investigated Claude 3 Opus’s performance in Radiology Diagnosis Please cases using clinical history, key images, and imaging findings.
Adding key images or imaging findings to the clinical history significantly improved the rate of correct primary diagnoses from 19.3% to 28.8% or 55.3%, respectively.
When two additional differential diagnoses were presented, total correct responses improved by 3.1–6.8%.

Summary statement
The large language AI model Claude 3 Opus demonstrated significantly improved diagnostic accuracy when key images were added to clinical history, compared with clinical history alone.
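The paired comparison described in Materials and Methods can be illustrated with a short sketch of McNemar’s test. The abstract does not say whether the exact or the chi-square form was used, and it does not report the discordant-pair counts, so the function below assumes the exact (binomial) form and the example counts are hypothetical placeholders, chosen only so that their difference matches the reported 93 − 62 = 31 gap between Conditions 3 and 1. The function name `mcnemar_exact` is likewise an illustrative choice, not something taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: cases answered correctly under the first condition only
    c: cases answered correctly under the second condition only
    Cases that are correct (or incorrect) under both conditions do not
    enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Lower tail of Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # HYPOTHETICAL discordant counts (not reported in the abstract):
    # 10 cases correct with history only, 41 correct with key images only.
    p_value = mcnemar_exact(10, 41)
    print(f"p = {p_value:.2e}")
```

Because the test uses only the cases on which the two conditions disagree, the marginal totals in Results (62, 178, and 93 of 322) are not sufficient on their own to reproduce the reported p-values.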

Related Results

Exploring Large Language Models Integration in the Histopathologic Diagnosis of Skin Diseases: A Comparative Study
Abstract Introduction The exact manner in which large language models (LLMs) will be integrated into pathology is not yet fully comprehended. This study examines the accuracy, bene...
Hydatid Disease of The Brain Parenchyma: A Systematic Review
Abstract Introduction Isolated brain hydatid disease (BHD) is an extremely rare form of echinococcosis. A prompt and timely diagnosis is a crucial step in disease management. This ...
Differential Diagnosis of Neurogenic Thoracic Outlet Syndrome: A Review
Abstract Thoracic outlet syndrome (TOS) is a complex and often overlooked condition caused by the compression of neurovascular structures as they pass through the thoracic outlet. ...
Primary Thyroid Non-Hodgkin B-Cell Lymphoma: A Case Series
Abstract Introduction Non-Hodgkin lymphoma (NHL) of the thyroid, a rare malignancy linked to autoimmune disorders, is poorly understood in terms of its pathogenesis and treatment o...
Breast Carcinoma within Fibroadenoma: A Systematic Review
Abstract Introduction Fibroadenoma is the most common benign breast lesion; however, it carries a potential risk of malignant transformation. This systematic review provides an ove...
Chest Wall Hydatid Cysts: A Systematic Review
Abstract Introduction Given the rarity of chest wall hydatid disease, information on this condition is primarily drawn from case reports. Hence, this study systematically reviews t...
Microwave Ablation with or Without Chemotherapy in Management of Non-Small Cell Lung Cancer: A Systematic Review
Abstract Introduction Microwave ablation (MWA) has emerged as a minimally invasive treatment for patients with inoperable non-small cell lung cancer (NSCLC). However, whether it i...
Clinicopathological Features of Indeterminate Thyroid Nodules: A Single-center Cross-sectional Study
Abstract Introduction Due to indeterminate cytology, Bethesda III is the most controversial category within the Bethesda System for Reporting Thyroid Cytopathology. This study exam...
