Research on the Application of Generative Artificial Intelligence to Evaluate Responses Related to Questions About COVID-19 in Terms of Their Accuracy and Readability
Objective: This study compares the accuracy and readability of COVID-19 prevention and control knowledge generated by four major generative artificial intelligence models, two international (ChatGPT and Gemini) and two domestic (Kimi and Ernie Bot), to evaluate the performance characteristics of domestic and international models.
Methods: The knowledge Q&A from the COVID-19 prevention guidelines issued by the U.S. Centers for Disease Control and Prevention (CDC) was used as the evaluation standard. The texts generated by the four models were compared with this standard in terms of accuracy, readability, and understandability. A neural network model based on intelligent algorithms was then used to extract the factors influencing the readability of the generated texts. Finally, text analysis was applied to explore the medical topics in the generated texts.
Results: Text accuracy: Domestic models showed higher accuracy in generated texts, while international models demonstrated better reliability. Text readability: Domestic models produced fluent language and a style suitable for public reading; international models exhibited better stability and tended to generate formal documentation. Text understandability: Domestic models had better readability; international models had more stable output. Readability influencing factors: The sentence length indicator (AWPS) was the most important factor affecting readability for texts generated by both domestic and international models. Topic analysis: ChatGPT focused more on epidemiological knowledge; Gemini on the healthcare field; Kimi on multidisciplinary information; and Ernie Bot on clinical medical topics.
Conclusion: Texts generated by domestic models are easy to understand, more suitable for public reading, and better suited for clinical testing, health consultation, and similar applications. Texts generated by international models have higher accuracy and professionalism, focusing more on epidemiological analysis, disease severity assessment, and related fields. Based on these findings, infectious disease prevention knowledge systems, such as those for COVID-19, should pay more attention to the public's knowledge base and comprehension level and integrate professionalism with accessibility in AI-generated knowledge, thereby providing objective reference materials for future major infectious disease outbreaks.
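The abstract names a sentence length indicator (AWPS) as the most important readability factor but does not say how it was computed. The sketch below assumes AWPS means average words per sentence, its usual reading in readability work; the function name, the punctuation-based sentence splitter, and the sample text are illustrative assumptions, not the study's actual pipeline.

```python
import re


def average_words_per_sentence(text: str) -> float:
    """Return the mean number of words per sentence in `text`.

    Assumption: AWPS = total words / total sentences. Sentences are split
    naively on '.', '!' and '?', which is NOT the study's documented method.
    """
    # Split on runs of sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    # Count whitespace-separated tokens in each sentence.
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(sentences)


# Hypothetical sample text, not taken from the study's data.
sample = "Wash your hands often. Stay home if you feel sick."
print(average_words_per_sentence(sample))
```

A naive splitter like this miscounts abbreviations such as "U.S.", so a real analysis would likely rely on a proper sentence tokenizer.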
Related Results
A Comparative Study of the Accuracy and Readability of Responses from Four Generative AI Models to COVID-19-Related Questions
The purpose of this study is to compare the accuracy and readability of Coronavirus Disease 2019 (COVID-19)-prevention and control knowledge texts generated by four current generat...
Assessment of Chat-GPT, Gemini, and Perplexity in Principle of Research Publication: A Comparative Study
Introduction: Many researchers utilize artificial intelligence (AI) to aid their research endeavors. This study seeks to assess and contrast the performance of three sophis...
Post-Pandemic Support for Special Populations in Higher Education through Generative Artificial Intelligence
The sudden closure of schools in response to the COVID-19 pandemic prompted education authorities to quickly explore new teaching and learning methods. This disruption to tradition...
P-525 ChatGPT 4.0: accurate, clear, relevant, and readable responses to frequently asked fertility patient questions
Study question: What is the accuracy, clarity, relevance and readability of ChatGPT’s responses to frequently asked fert...
(021) ChatGPT's Ability to Assess Quality and Readability of Online Medical Information
Introduction: Health literacy plays a crucial role in enabling patients to understand and effectively use medical inform...
PERSEPSI IBU HAMIL TENTANG VAKSIN COVID-19 TERHADAP PELAKSANAAN VAKSINASI COVID-19
Background: Positive COVID-19 cases in Sukoharjo Regency in 2021 reached 12,350 and continued to rise. Of these, there were 168 positive Covi...
Exploring Large Language Models Integration in the Histopathologic Diagnosis of Skin Diseases: A Comparative Study
Introduction: The exact manner in which large language models (LLMs) will be integrated into pathology is not yet fully comprehended. This study examines the accuracy, bene...
Assessing the quality and readability of patient education materials on chemotherapy cardiotoxicity from artificial intelligence chatbots: An observational cross-sectional study
Artificial intelligence (AI) and the introduction of Large Language Model (LLM) chatbots have become a common source of patient inquiry in healthcare. The quality and readability o...

