Performance of AI‐Chatbots to Common Temporomandibular Joint Disorders (TMDs) Patient Queries: Accuracy, Completeness, Reliability and Readability
ABSTRACT
TMDs are a common group of conditions affecting the temporomandibular joint (TMJ), often resulting from factors like injury, stress or teeth grinding. This study aimed to evaluate the accuracy, completeness, reliability and readability of the responses generated by ChatGPT‐3.5, ChatGPT‐4o and Google Gemini to TMD‐related inquiries. Forty‐five questions covering various aspects of TMDs were created by two experts and submitted by one author to ChatGPT‐3.5, ChatGPT‐4o and Google Gemini on the same day. The responses were evaluated for accuracy, completeness and reliability using modified Likert scales. Readability was analysed with six validated indices via a specialised tool. Additional features, such as the inclusion of graphical elements, references and safeguard mechanisms, were also documented and analysed. The Pearson Chi‐Square and One‐Way ANOVA tests were used for data analysis. Google Gemini achieved the highest accuracy, providing 100% correct responses, followed by ChatGPT‐3.5 (95.6%) and ChatGPT‐4o (93.3%). ChatGPT‐4o provided the most complete responses (91.1%), followed by ChatGPT‐3.5 (64.4%) and Google Gemini (42.2%). The majority of responses were reliable, with ChatGPT‐4o at 93.3% ‘Absolutely Reliable’, compared to 46.7% for ChatGPT‐3.5 and 48.9% for Google Gemini. Both ChatGPT‐4o and Google Gemini included references in their responses (22.2% and 13.3%, respectively), while ChatGPT‐3.5 included none. Google Gemini was the only model that included multimedia (6.7%). Readability scores were highest for ChatGPT‐3.5, suggesting its responses were more complex than those of Google Gemini and ChatGPT‐4o. Both ChatGPT‐4o and Google Gemini demonstrated accuracy and reliability in addressing TMD‐related questions, with their responses being clear, easy to understand and complemented by safeguard statements encouraging specialist consultation. However, both platforms lacked evidence‐based references. Only Google Gemini incorporated multimedia elements into its answers.
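As an illustration of the Pearson Chi‐Square test named in the abstract, the sketch below applies it to the reported completeness percentages. It is not the authors' analysis. The counts are reconstructed from the percentages and the 45 questions, and collapsing the modified Likert scale into "complete" versus "not complete" is an assumption, since the abstract does not give the underlying categories. The function names are hypothetical.

```python
import math

QUESTIONS = 45  # number of questions per chatbot, from the abstract


def counts_from_percent(percent, total=QUESTIONS):
    """Reconstruct [hits, misses] from a reported percentage (assumption)."""
    hits = round(percent / 100 * total)
    return [hits, total - hits]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail p-value; this closed form holds only for 2 degrees of freedom."""
    return math.exp(-statistic / 2)


# Completeness percentages as reported in the abstract.
completeness = {"ChatGPT-4o": 91.1, "ChatGPT-3.5": 64.4, "Google Gemini": 42.2}
table = [counts_from_percent(p) for p in completeness.values()]

stat, dof = pearson_chi_square(table)
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value_two_dof(stat):.2g}")
```

Under these assumptions the difference in completeness between the three chatbots would be statistically significant. The result depends on how the Likert categories are grouped, so it should be read only as a worked example of the test.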

