Evaluating the Quality and Readability of AI-Generated Patient Education Materials for Hysterectomy [ID 1378]
INTRODUCTION:
Surgical counseling for hysterectomy can overwhelm patients, making supplemental materials valuable. We evaluated the quality and readability of hysterectomy educational materials generated by three AI chatbots.
METHODS:
We prompted three AI chatbots—ChatGPT, Perplexity, and Microsoft Bing AI—to answer questions from the UpToDate patient handout “Hysterectomy: (The Basics)” for comparison. Three independent scorers evaluated quality using the DISCERN instrument. Readability was assessed using the Flesch–Kincaid Grade Level (FKGL), Flesch–Kincaid Reading Ease (FKRE), and Gunning Fog Index (GFI).
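For reference, the three readability indices named above can be sketched from their standard published formulas. This is an illustrative sketch only, not the tooling used in the study: the helper names and the vowel-group syllable heuristic are assumptions, and dedicated readability software counts syllables and sentences more carefully.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per run of vowels, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_counts(text: str):
    """Return (sentences, words, syllables, complex_words) for a text.

    A 'complex' word here is any word with three or more heuristic syllables.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return len(sentences), len(words), syllables, complex_words

def fkre(sentences: int, words: int, syllables: int) -> float:
    """Flesch-Kincaid Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def fkgl(sentences: int, words: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def gfi(sentences: int, words: int, complex_words: int) -> float:
    """Gunning Fog Index: approximate years of formal education needed."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))
```

Longer sentences and more syllables per word lower the FKRE score and raise the FKGL and GFI scores, which is why a low reading-ease score and a high grade level describe the same text.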
RESULTS:
The mean DISCERN score was 3.7 (SD 0.58) for UpToDate and 3.25 (SD 0.71) for the chatbots. The chatbots’ mean FKRE score was 29.18 (SD 4), corresponding to a college graduate reading level. UpToDate’s mean FKGL was 6.9 (SD 0.06), indicating a middle school reading level, whereas the chatbots scored 13.4 (ChatGPT), 11.9 (Perplexity), and 12.6 (Microsoft Bing AI), corresponding to college level. The mean GFI was 9.1 (SD 1.0) for UpToDate (grade 9) and 20.2 (SD 8.3) for the chatbots (university level).
CONCLUSIONS/IMPLICATIONS:
According to the DISCERN tool, both AI-generated and UpToDate patient information on hysterectomy is of medium quality. Responses generated by chatbots are written at higher reading levels and exhibit lower reliability and accuracy than the corresponding information from UpToDate. Because this technology is still nascent, AI-generated content on hysterectomy should not be relied upon as a direct resource for patient care.
Ovid Technologies (Wolters Kluwer Health)

