Human-supervised, large language model-based clinical decision support aligned to national newborn protocols in Kenya: a pragmatic, early-stage evaluation

Abstract

Introduction: Timely, protocol-adherent clinical decisions are crucial for reducing neonatal mortality in low-resource settings. Translating extensive national guidelines into bedside practice remains challenging.

Objective: We developed and evaluated AIFYA, a human-supervised, large language model (LLM)-based clinical decision support system (CDSS) aligned with Kenya’s national newborn care protocols.

Methods: This prospective, mixed-methods, early-stage evaluation, guided by the DECIDE-AI framework, embedded AIFYA into routine workflows at two public health facilities (Level 5 and Level 4) in Bungoma County, Kenya, from September 2024 to June 2025. Primary outcomes were: (1) adoption, measured by cumulative neonatal cases managed; (2) training reach, assessed by credentialed healthcare workers (HCWs); and (3) guideline and citation concordance, evaluated through blinded review of 118 AI-generated recommendations by two neonatologists, with adjudication by a third. Secondary outcomes included protocol adherence and triage-to-decision time.

Results: A total of 50 HCWs were trained, and 550 neonatal cases were managed over 10 months. Among surveyed HCWs (n = 33), 76% were female (mean age 32.1 years). Expert review found 75% of recommendations were correct and 15% partially correct, with strong interrater reliability (weighted Cohen’s kappa 0.85; 95% CI 0.79–0.91). Citation accuracy was 96%. In 40 complex dosing scenarios, 75% of outputs were rated correct. The median triage-to-decision time was 23 minutes (IQR 18–31). Implementation was supported by an offline-first architecture and a facility-based coaching model, sustaining engagement despite staff turnover.

Conclusion: A human-supervised AI CDSS, directly and transparently anchored to national clinical guidelines, can be successfully implemented in routine, low-resource neonatal care settings. The system demonstrated high user adoption and strong expert-rated concordance. The high citation accuracy is a critical feature that builds clinical trust, ensuring safety and enabling auditable AI. These findings provide a robust foundation for progression to controlled, multi-site trials to evaluate clinical effectiveness.
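
The interrater reliability reported in the abstract is a weighted Cohen’s kappa between two reviewers. As a minimal sketch of how such a statistic can be computed, the Python below assumes three ordinal rating levels (0 = incorrect, 1 = partially correct, 2 = correct) and linear disagreement weights. The abstract does not state the weighting scheme or the software used, so both are assumptions here, and the function name `weighted_kappa` and the ratings in the usage example are hypothetical, not study data.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories, weight="linear"):
    """Weighted Cohen's kappa for two raters using ordinal codes 0..n_categories-1.

    Assumption: linear weights by default; the abstract does not say which
    weighting the authors used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("need at least two rating categories")

    n = len(rater_a)
    pair_counts = Counter(zip(rater_a, rater_b))  # observed joint ratings
    marg_a = Counter(rater_a)                     # rater A marginal counts
    marg_b = Counter(rater_b)                     # rater B marginal counts

    def disagreement(i, j):
        # 0 for identical ratings, 1 for the two most distant categories
        d = abs(i - j) / (n_categories - 1)
        return d if weight == "linear" else d * d

    observed = sum(disagreement(i, j) * c for (i, j), c in pair_counts.items()) / n
    expected = sum(
        disagreement(i, j) * marg_a[i] * marg_b[j]
        for i in range(n_categories)
        for j in range(n_categories)
    ) / (n * n)

    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical ratings for ten recommendations (illustration only).
    reviewer_1 = [2, 2, 1, 2, 0, 2, 1, 2, 2, 0]
    reviewer_2 = [2, 2, 1, 1, 0, 2, 2, 2, 2, 0]
    print(round(weighted_kappa(reviewer_1, reviewer_2, n_categories=3), 3))
```

Passing `weight="quadratic"` penalises a two-level disagreement (correct versus incorrect) more heavily than a one-level one, which is a common alternative for ordinal clinical ratings.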

openRxiv

Teresia Kuria, Gideon Kamau, Felisters Makokha, Protus Omondi, George Mbugua, David Kamau, Samuel Mbugua, Jesse Gitaka

2026
