How Large Language Models Can Affect Clinical Reasoning: A Randomized Clinical Trial
Abstract
Importance
LLMs have encoded a vast array of medical knowledge and are being integrated into clinical settings as decision-support tools to improve physician performance across various aspects of care. However, evidence of the impact of LLMs on the clinical reasoning of physicians remains limited.
Objective
To evaluate the impact of LLM access on three core aspects of physicians’ clinical reasoning in primary care scenarios: diagnostic reasoning, information gathering, and management reasoning.
Design, Setting, and Participants
We conducted three identical randomized controlled trials (RCTs) in 2024–2025 with 249 physicians in Indonesia, Kenya, and the Netherlands. Participants completed four or five clinical vignettes designed to simulate real-world primary care consultations, with half randomized to have access to ChatGPT-4o.
Main Outcomes and Measures
Physician quality of care was evaluated using a rubric based on evidence-based clinical guidelines, scored across nine steps of the clinical reasoning process. Primary outcomes were quality scores for diagnostic reasoning, information gathering, and management. Secondary outcomes were quality per answer, number of answers, and less obvious answers.
Results
Access to LLMs enhanced information gathering and management reasoning across all countries. Physicians who were assigned the LLM achieved significantly better quality-of-care scores in diagnostic steps in Indonesia (b = 7.9%, CI: 4.0% to 11.8%, p < .001) and Kenya (b = 15.1%, CI: 10.2% to 19.9%, p < .001) but not the Netherlands (b = 1.4%, CI: −1.6% to 4.4%, p = 1.00). Physicians with LLM access also performed better in investigative steps in Indonesia (b = 10.7%, CI: 4.2% to 17.1%, p = .004), Kenya (b = 17.1%, CI: 10.3% to 23.9%, p < .001), and the Netherlands (b = 11.9%, CI: 7.7% to 16.1%, p < .001). We also found LLM access affected physicians’ scores in management steps (Indonesia: b = 15.7%, CI: 8.6% to 22.9%, p < .001; Kenya: b = 27.3%, CI: 19.9% to 34.7%, p < .001; the Netherlands: b = 12.3%, CI: 7.1% to 17.5%, p < .001). We found that LLM access was less useful in management reasoning for more cognitively demanding cases compared to standard patient cases in Indonesia (b = −14.1%, CI: −21.4% to −6.8%, p < .001) and Kenya (b = −12.1%, CI: −19.6% to −4.6%, p = .006).
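As a reading aid, the main-effect estimates reported above can be tabulated and checked for whether each confidence interval excludes zero. The following is a minimal sketch, not the authors' analysis code: the numbers are copied from the Results text, while the dictionary layout and the names `REPORTED`, `ci_excludes_zero`, and `summarize` are illustrative assumptions. It does not reproduce the reported p-values.

```python
# Estimates (b) and CI bounds, in percentage points, copied from the Results text.
# The data layout and all names here are hypothetical, chosen for illustration.
REPORTED = {
    ("diagnostic", "Indonesia"): (7.9, 4.0, 11.8),
    ("diagnostic", "Kenya"): (15.1, 10.2, 19.9),
    ("diagnostic", "Netherlands"): (1.4, -1.6, 4.4),
    ("investigative", "Indonesia"): (10.7, 4.2, 17.1),
    ("investigative", "Kenya"): (17.1, 10.3, 23.9),
    ("investigative", "Netherlands"): (11.9, 7.7, 16.1),
    ("management", "Indonesia"): (15.7, 8.6, 22.9),
    ("management", "Kenya"): (27.3, 19.9, 34.7),
    ("management", "Netherlands"): (12.3, 7.1, 17.5),
}


def ci_excludes_zero(low: float, high: float) -> bool:
    """Return True when the interval lies entirely above or entirely below zero."""
    return low > 0 or high < 0


def summarize(reported):
    """Return (step, country, estimate, excludes_zero) rows for each reported effect."""
    return [
        (step, country, b, ci_excludes_zero(low, high))
        for (step, country), (b, low, high) in reported.items()
    ]


if __name__ == "__main__":
    for step, country, b, clear in summarize(REPORTED):
        flag = "CI excludes 0" if clear else "CI includes 0"
        print(f"{step:<14}{country:<13}{b:>6.1f}%  {flag}")
```

In this tabulation, the only interval that includes zero is the diagnostic-step estimate for the Netherlands, which matches the abstract's statement that no significant diagnostic effect was found there.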
Conclusions and Relevance
In this cross-country randomized controlled trial, we found that access to an LLM had significant positive effects on physicians’ clinical reasoning. These effects are promising for the further roll-out of LLMs to supplement physicians in their care tasks. They also suggest that the extent to which LLMs can supplement physicians is context dependent.
Key Points
Question
To what extent do large language models (LLMs) improve the quality of physicians’ diagnostic reasoning, information gathering, and management reasoning?
Findings
In a randomized clinical trial including 249 physicians in Indonesia, Kenya, and the Netherlands, access to an LLM significantly enhanced clinical reasoning performance in information gathering and management reasoning across all countries, and diagnostic reasoning in Kenya and Indonesia.
Meaning
This study shows that the use of an LLM can enhance the clinical reasoning of physicians. Further research is needed to understand how LLMs can effectively augment physicians’ clinical practice.