
Performance of ChatGPT in Ophthalmic Registration and Clinical Diagnosis: Cross-Sectional Study (Preprint)

BACKGROUND Artificial intelligence (AI) chatbots such as ChatGPT are expected to impact vision health care significantly. Their potential to optimize the consultation process and their diagnostic capabilities across a range of ophthalmic subspecialties have yet to be fully explored.
OBJECTIVE This study aims to investigate the performance of AI chatbots in recommending ophthalmic outpatient registration and diagnosing eye diseases within clinical case profiles.
METHODS This cross-sectional study used clinical cases from Chinese Standardized Resident Training–Ophthalmology (2nd Edition). For each case, 2 profiles were created: patient with history (Hx) and patient with history and examination (Hx+Ex). These profiles served as independent queries for GPT-3.5 and GPT-4.0 (accessed from March 5 to 18, 2024). The same profiles were also presented to 3 ophthalmology residents in questionnaire format. The accuracy of recommending ophthalmic subspecialty registration was primarily evaluated using Hx profiles. The accuracy of the top-ranked diagnosis and the accuracy of the diagnosis within the top 3 suggestions (do-not-miss diagnosis) were assessed using Hx+Ex profiles. The gold standard for judgment was the published, official diagnosis. Characteristics of incorrect diagnoses by ChatGPT were also analyzed.
RESULTS A total of 208 clinical profiles from 12 ophthalmic subspecialties were analyzed (104 Hx and 104 Hx+Ex profiles). For Hx profiles, GPT-3.5, GPT-4.0, and residents showed comparable accuracy in registration suggestions (66/104, 63.5%; 81/104, 77.9%; and 72/104, 69.2%, respectively; P=.07), with ocular trauma, retinal diseases, and strabismus and amblyopia achieving the top 3 accuracies. For Hx+Ex profiles, both GPT-4.0 and residents demonstrated higher diagnostic accuracy than GPT-3.5 (62/104, 59.6% and 63/104, 60.6% vs 41/104, 39.4%; P=.003 and P=.001, respectively). Accuracy for do-not-miss diagnoses was likewise higher for GPT-4.0 and residents than for GPT-3.5 (79/104, 76% and 68/104, 65.4% vs 51/104, 49%; P<.001 and P=.02, respectively). The highest diagnostic accuracies were observed in glaucoma; lens diseases; and eyelid, lacrimal, and orbital diseases. GPT-4.0 recorded fewer incorrect top-3 diagnoses (25/42, 60% vs 53/63, 84%; P=.005) and more partially correct diagnoses (21/42, 50% vs 7/63, 11%; P<.001) than GPT-3.5, while GPT-3.5 had more completely incorrect (27/63, 43% vs 7/42, 17%; P=.005) and less precise diagnoses (22/63, 35% vs 5/42, 12%; P=.009).
CONCLUSIONS GPT-3.5 and GPT-4.0 showed intermediate performance in recommending ophthalmic subspecialties for registration. While GPT-3.5 underperformed, GPT-4.0 approached and numerically surpassed residents in differential diagnosis. AI chatbots show promise in facilitating ophthalmic patient registration. However, their integration into diagnostic decision-making requires more validation.
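The reported percentages and the denominators of the error analysis can be rechecked from the counts in the RESULTS paragraph. The Python sketch below is illustrative only: the function names `pct` and `chi2_2x2` are hypothetical, and the abstract does not say which statistical test the authors used. The uncorrected Pearson chi-square shown here is an assumption, and because every group answered the same 104 cases, a paired test may be more appropriate, so the resulting P values need not match the published ones.

```python
from math import erfc, sqrt

# Counts copied from the RESULTS paragraph (correct answers out of 104 Hx+Ex profiles).
TOTAL = 104
top1_correct = {"GPT-3.5": 41, "GPT-4.0": 62, "residents": 63}
top3_correct = {"GPT-3.5": 51, "GPT-4.0": 79, "residents": 68}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


def chi2_2x2(hit_a: int, hit_b: int, n: int = TOTAL) -> tuple[float, float]:
    """Uncorrected Pearson chi-square for two groups of n trials each.

    Assumption for illustration: the abstract does not name the authors' test.
    """
    miss_a, miss_b = n - hit_a, n - hit_b
    grand_total = 2 * n
    numerator = grand_total * (hit_a * miss_b - miss_a * hit_b) ** 2
    denominator = n * n * (hit_a + hit_b) * (miss_a + miss_b)
    stat = numerator / denominator
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 degree of freedom
    return stat, p_value


if __name__ == "__main__":
    for name in top1_correct:
        wrong_top1 = TOTAL - top1_correct[name]  # denominator of the error analysis
        wrong_top3 = TOTAL - top3_correct[name]  # diagnoses missing from the top 3
        print(name, pct(top1_correct[name], TOTAL), pct(top3_correct[name], TOTAL),
              wrong_top1, wrong_top3)
    print(chi2_2x2(top1_correct["GPT-4.0"], top1_correct["GPT-3.5"]))
```

The subtraction step shows where the 42 and 63 in the error analysis come from: they are the cases in which GPT-4.0 and GPT-3.5, respectively, did not rank the correct diagnosis first.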

Related Results

Exploring Large Language Models Integration in the Histopathologic Diagnosis of Skin Diseases: A Comparative Study
Abstract Introduction The exact manner in which large language models (LLMs) will be integrated into pathology is not yet fully comprehended. This study examines the accuracy, bene...
Assessment of Chat-GPT, Gemini, and Perplexity in Principle of Research Publication: A Comparative Study
Abstract Introduction Many researchers utilize artificial intelligence (AI) to aid their research endeavors. This study seeks to assess and contrast the performance of three sophis...
ChatGPT's Capabilities for Use in Anatomy Education and Anatomy Research
Dear Editors, Recently, the discussion of an artificial intelligence (AI) - fueled platform in several articles in your journal has attracted the attention of many researchers [1, ...
Unlocking Educational Potential: Exploring Students’ Satisfaction and Sustainable Engagement with ChatGPT Using the ECM Model
Aim/Purpose: The main goal of this study is to investigate the factors affecting students’ satisfaction and continuous usage of ChatGPT in an educational context, using the Expecta...
User Intentions to Use ChatGPT for Self-Diagnosis and Health-Related Purposes: Cross-sectional Survey Study (Preprint)
BACKGROUND With the rapid advancement of artificial intelligence (AI) technologies, AI-powered chatbots, such as Chat Generative Pretrained Transformer (Cha...
Appearance of ChatGPT and English Study
The purpose of this study is to examine the definition and characteristics of ChatGPT in order to present the direction of self-directed learning to learners, and to explore the po...
ChatGPT Versus Consultants: Blinded Evaluation on Answering Otorhinolaryngology Case–Based Questions (Preprint)
BACKGROUND Large language models (LLMs), such as ChatGPT (Open AI), are increasingly used in medicine and supplement standard search engines as information ...
