
Evaluating the evidence-based potential of six large language models in paediatric dentistry: a comparative study on generative artificial intelligence

Abstract

Purpose: The use of large language models (LLMs) in generative artificial intelligence (AI) is rapidly increasing in dentistry. However, their reliability has yet to be fully established. This study aims to evaluate the diagnostic accuracy, clinical applicability, and patient education potential of LLMs in paediatric dentistry by evaluating the responses of six LLMs: Google AI’s Gemini and Gemini Advanced, OpenAI’s ChatGPT-3.5, ChatGPT-4o and ChatGPT-4, and Microsoft’s Copilot.

Methods: Ten open-ended clinical questions relevant to paediatric dentistry were posed to the LLMs. The responses were graded by two independent evaluators from 0 to 10 using a detailed rubric. After 4 weeks, the answers were re-evaluated to assess intra-evaluator reliability. Statistical comparisons used the Friedman, Wilcoxon, and Kruskal–Wallis tests to identify the model that provided the most comprehensive, accurate, explicit, and relevant answers.

Results: Scores varied across models. ChatGPT-4 answers scored highest (average score 8.08), followed by Gemini Advanced (8.06), ChatGPT-4o (8.01), ChatGPT-3.5 (7.61), Gemini (7.32), and Copilot (5.41). Statistical analysis revealed that ChatGPT-4 outperformed all other LLMs, and the difference was statistically significant. Despite variations and different responses to the same queries, remarkable similarities were observed. Except for Copilot, all chatbots scored above 6.5 on all queries.

Conclusion: This study demonstrates the potential of large language models (LLMs) to support evidence-based paediatric dentistry. Nevertheless, they cannot be regarded as completely trustworthy. Dental professionals should use AI models critically, as supportive tools and not as a substitute for overall scientific knowledge and critical thinking.
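The abstract names the Friedman test among its statistical comparisons but gives no computational detail. As a rough illustration only, the sketch below shows how a Friedman statistic could be computed when the same questions are scored for every model. It is not the authors' analysis: the function names, the layout (one row per question, one column per model) and every score in `hypothetical_scores` are assumptions invented for this example, and no p-value is derived.

```python
from collections import Counter


def average_ranks(row):
    """Rank one row from 1 (lowest) upward; tied values share their mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Return (tie-corrected Friedman chi-square statistic, rank sum per column).

    scores: list of rows, one row per question (block),
            one column per model (treatment).
    """
    n = len(scores)
    k = len(scores[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in scores:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        # Each group of t tied values in a row contributes t^3 - t.
        tie_term += sum(t ** 3 - t for t in Counter(row).values())
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    return q / correction, rank_sums


# Hypothetical placeholder scores, NOT data from the study.
# Columns: ChatGPT-4, Gemini Advanced, ChatGPT-4o, ChatGPT-3.5, Gemini, Copilot.
hypothetical_scores = [
    [8.5, 8.0, 7.5, 7.0, 6.5, 5.0],
    [8.0, 8.5, 7.0, 7.5, 6.0, 5.5],
    [7.5, 8.0, 8.5, 7.0, 6.5, 4.5],
    [8.5, 7.5, 8.0, 6.5, 7.0, 5.0],
]

statistic, rank_sums = friedman_statistic(hypothetical_scores)
print("Friedman statistic:", round(statistic, 3))
print("Rank sums per model:", rank_sums)
```

With real data, the statistic would be compared against a chi-square distribution with k - 1 degrees of freedom (k being the number of models), and a significant result would typically be followed by pairwise tests such as the Wilcoxon signed-rank test mentioned in the Methods.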

Related Results

Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas (The Relationship between Dietary Behaviour and the Incidence of Obesity in Children)
Primerjalna književnost na prelomu tisočletja (Comparative Literature at the Turn of the Millennium)
In a comprehensive and at times critical manner, this volume seeks to shed light on the development of events in Western (i.e., European and North American) comparative literature ...
OA27 Growth of the UK and Ireland paediatric rheumatology nurses’ group
Abstract Introduction/Background The Paediatric Rheumatology Clinical Nurse Specialist often has to manage a large caseload of c...
Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Abstract The Physical Activity Guidelines for Americans (Guidelines) advises older adults to be as active as possible. Yet, despite the well documented benefits of physical a...
Do evidence summaries increase health policy‐makers' use of evidence from systematic reviews? A systematic review
This review summarizes the evidence from six randomized controlled trials that judged the effectiveness of systematic review summaries on policymakers' decision making, or the most...
Post-Pandemic Support for Special Populations in Higher Education through Generative Artificial Intelligence
The sudden closure of schools in response to the COVID-19 pandemic prompted education authorities to quickly explore new teaching and learning methods. This disruption to tradition...
Paediatric dentistry undergraduate education across dental schools in the Arabian region: a cross-sectional study
Abstract Purpose To assess and compare teaching of paediatric dentistry in the undergraduate curriculum among dental schools in the Arabian region. ...
