
Evaluating the Quality and Readability of AI-Generated Patient Education Materials for Hysterectomy [ID 1378]

INTRODUCTION: Surgical counseling for hysterectomy can overwhelm patients, making supplemental materials valuable. We evaluated the quality and readability of hysterectomy educational materials generated by three AI chatbots. METHODS: We prompted three AI chatbots (ChatGPT, Perplexity, and Microsoft Bing AI) to answer the questions from the UpToDate patient handout “Hysterectomy: (The Basics)” and compared their responses with the handout. Three independent scorers evaluated quality using the DISCERN instrument. Readability was assessed using the Flesch–Kincaid Grade Level (FKGL), Flesch–Kincaid Reading Ease (FKRE), and Gunning Fog Index (GFI). RESULTS: The mean DISCERN score was 3.7 (SD 0.58) for UpToDate and 3.25 (SD 0.71) for the chatbots. The mean chatbot FKRE score was 29.18 (SD 4), corresponding to a college-graduate reading level. UpToDate’s mean FKGL was 6.9 (SD 0.06), indicating a middle school reading level, whereas the chatbots scored 13.4 (ChatGPT), 11.9 (Perplexity), and 12.6 (Microsoft), corresponding to college level. The mean GFI was 9.1 (SD 1.0) for UpToDate (grade 9) and 20.2 (SD 8.3) for the chatbots (university level). CONCLUSIONS/IMPLICATIONS: According to the DISCERN tool, both AI-generated and UpToDate patient information on hysterectomy are of medium quality. Responses generated by chatbots demonstrate higher reading levels and exhibit lower reliability and accuracy than corresponding information from UpToDate. Due to the nascent nature of this technology, AI-generated content on hysterectomy should not be relied upon as a direct resource for patient care.
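For readers unfamiliar with the three readability indices named in the methods, the following is a minimal sketch of how they are conventionally computed from word, sentence, and syllable counts. The abstract does not state which tool the authors used, so this is not their method. The function names are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration; dedicated readability software will give somewhat different scores. The reading-ease formula shown is the standard Flesch Reading Ease, which the abstract abbreviates as FKRE.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    # Higher is easier; scores below about 30 are usually read as college-graduate level.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    # Approximate US school grade needed to read the text.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    # "Complex" words are those with three or more syllables.
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def readability(text: str) -> dict:
    """Compute all three indices for a passage using the heuristic counter above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    n_words, n_sent = len(words), len(sentences)
    n_complex = sum(1 for s in syllables if s >= 3)
    return {
        "FKRE": flesch_reading_ease(n_words, n_sent, sum(syllables)),
        "FKGL": flesch_kincaid_grade(n_words, n_sent, sum(syllables)),
        "GFI": gunning_fog(n_words, n_sent, n_complex),
    }
```

All three indices rise in difficulty with longer sentences and longer words, which is why responses written in long, polysyllabic sentences score at college level even when their content matches a middle-school handout.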

Ovid Technologies (Wolters Kluwer Health)

Amy M. Huddleson, Gabriella Philip, Ananda Thomas, Amy George

Obstetrics & Gynecology

2025


