Neck dissection in head and neck surgery: An assessment of ChatGPT performance

Artificial intelligence models such as chat generative pre-trained transformer (ChatGPT) are being increasingly used to inform treatment-related decisions. Among otolaryngology subspecialties, there is a paucity of literature examining the role of ChatGPT within head and neck surgical oncology. The utility of ChatGPT in addressing questions related to surgically relevant anatomy and lymphadenectomy procedures remains poorly understood. The primary pilot study objective was to determine the reliability of ChatGPT in answering neck dissection-related inquiries compared to expert head and neck surgical oncologists. Five neck dissection-related questions were presented to ChatGPT v3.5. Three fellowship-trained head and neck surgeons compared AI-generated responses to those of an expert head and neck surgeon. Raters, blinded to the author’s identity, evaluated the responses given based on a Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). The median level of agreement between raters for the ChatGPT responses was 1.0 (interquartile range [IQR]: 1.0, 2.5; minimum = 1 and maximum = 4), while the median level of agreement between raters for the surgeon responses was 5.0 (IQR: 5.0, 5.0; minimum = 5 and maximum = 5). The Mann–Whitney U test yielded a significance level of p=0.007 when comparing the level of agreement between ChatGPT and surgeon responses. Raters showed minimal consistency when evaluating ChatGPT responses (intraclass correlation coefficient = 0.05; 95% confidence interval: 0.0–0.88), in contrast to perfect agreement observed for the surgeon responses. In summary, ChatGPT is a promising tool in the acquisition of surgical knowledge. For neck dissection-related inquiries, a discrepancy between the reliability of ChatGPT-generated responses and surgeon expertise exists. Further refinement in AI models is needed to strengthen the utility of ChatGPT in head and neck oncologic surgery.
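The descriptive and comparative statistics reported above (median, IQR, range, and the Mann–Whitney U statistic) can be illustrated with a minimal sketch. The abstract does not include the raw ratings or name the software used, so the `chatgpt` and `surgeon` lists below are hypothetical values chosen only to show the calculation, and the helper names `summarize` and `mann_whitney_u` are illustrative, not the authors' method. The sketch computes the U statistic only and does not attempt to reproduce the reported p-value.

```python
from statistics import median, quantiles


def summarize(ratings):
    """Return (median, Q1, Q3, minimum, maximum) for a list of Likert ratings."""
    # The "inclusive" method treats the data as the full population of ratings.
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return median(ratings), q1, q3, min(ratings), max(ratings)


def mann_whitney_u(x, y):
    """U statistic for sample x against sample y; each tie counts as one half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical Likert ratings (1 = strongly disagree, 5 = strongly agree).
# These are NOT the study's data; they only illustrate the computation.
chatgpt = [1, 1, 2, 3, 4]
surgeon = [5, 5, 5, 5, 5]

print("ChatGPT summary:", summarize(chatgpt))
print("Surgeon summary:", summarize(surgeon))
print("U (ChatGPT vs. surgeon):", mann_whitney_u(chatgpt, surgeon))
```

A U statistic of 0 for one group means every rating in that group fell below every rating in the other, which is the most extreme separation two samples can show. The two U values for a pair of samples always sum to the product of the sample sizes.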

AccScience Publishing

Dustin A. Silverman, John S. Howard, Priscilla F. A. Pichardo, Yash J. Patil, Mekibib Altaye, Chad A. Zender, Alice L. Tang

Artificial Intelligence in Health

2025


