How AI Responds to Obstetric Ultrasound Questions and Analyzes and Explains Obstetric Ultrasound Reports: ChatGPT-3.5 vs. Microsoft Copilot in Bing
Abstract
Objectives: To evaluate and compare the accuracy and consistency of answers to obstetric ultrasound questions and analysis of obstetric ultrasound reports using publicly available ChatGPT-3.5 and Microsoft Copilot in Bing (Copilot).
Methods: Twenty questions related to obstetric ultrasound were answered and 110 obstetric ultrasound reports were analyzed by both ChatGPT-3.5 and Copilot, with each question and report posed to each model three times at different times. The accuracy and consistency of the responses to the twenty questions and of each report analysis were evaluated and compared.
Results: In answering the twenty questions, ChatGPT-3.5 outperformed Copilot in both accuracy (95.0% vs. 80.0%) and consistency (90.0% vs. 75.0%). When analyzing obstetric ultrasound reports, the two models performed similarly in accuracy and consistency, and both could provide recommendations. The overall accuracy and consistency were 83.86% and 87.30% for ChatGPT-3.5 vs. 77.51% and 90.48% for Copilot, respectively. However, in detecting an abnormal amniotic fluid index, ChatGPT-3.5 was superior to Copilot (accuracy 87.50% vs. 66.67%, P < 0.05).
Conclusion: While ChatGPT-3.5 and Copilot can provide valuable explanations of obstetric ultrasound and interpret most obstetric ultrasound reports accurately, neither model answered all questions correctly or with complete consistency; physician supervision is therefore crucial when these models are used.
Springer Science and Business Media LLC