CardiacGPT™: A Real-Time AI Assistant for Intraoperative Guidance and Postoperative Decision Support in Cardiac Surgery
Abstract
Background
Cardiac surgery is one of the most complex and high-stakes areas of medicine, where intraoperative decisions must be made within seconds and incomplete information can compromise outcomes. Traditional risk scores and rule-based decision support tools provide limited real-time guidance and rarely integrate the unstructured data streams available during surgery. Recent large language models (LLMs), such as OpenAI’s GPT-5 and Anthropic’s Claude 3.5 family, have demonstrated state-of-the-art reasoning, summarization, and clinical dialogue capabilities. However, their safety and trustworthiness in surgical settings remain untested.
Objective
To evaluate the feasibility and clinician trustworthiness of CardiacGPT, a real-time AI assistant that leverages the newest-generation LLMs for intraoperative guidance and postoperative decision support.
Methods
We retrospectively analyzed 500 de-identified cardiac surgery cases from Brigham and Women’s Hospital, including CABG, valve, and combined procedures. Structured EHR variables, intraoperative monitoring data, and operative notes were formatted into standardized prompts and processed through four models: OpenAI GPT-5, Anthropic Claude 3.5 Opus, Claude 3.5 Sonnet, and Claude 3.5 Haiku. Outputs were presented via a blinded Bidding App to attending cardiac surgeons and ICU clinicians, who scored trust and clinical relevance on a 5-point Likert scale. The primary outcome was the proportion of high-trust ratings (score ≥ 4); secondary outcomes included mean trust scores, variance, and inter-rater reliability.
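As a rough illustration of how the primary and secondary outcomes described above could be computed from a list of Likert ratings, here is a minimal Python sketch. It is not the authors' analysis code. The function name `summarize_trust`, the constant `HIGH_TRUST_THRESHOLD`, and the example ratings are all hypothetical, and the sketch assumes that "high trust" means a rating of 4 or 5.

```python
from statistics import mean, variance

# Assumption: a "high-trust" rating is a score of 4 or 5 on the 5-point scale.
HIGH_TRUST_THRESHOLD = 4


def summarize_trust(scores):
    """Return the high-trust proportion, mean, and sample variance of Likert scores."""
    if len(scores) < 2:
        raise ValueError("need at least two ratings to compute a sample variance")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("ratings must be integers from 1 to 5")
    high = sum(1 for s in scores if s >= HIGH_TRUST_THRESHOLD)
    return {
        "high_trust_proportion": high / len(scores),
        "mean": mean(scores),
        "variance": variance(scores),  # sample variance (n - 1 denominator)
    }


# Made-up ratings for illustration only; these are not data from the study.
example_scores = [5, 5, 4, 3, 5, 4, 2, 5, 4, 5]
summary = summarize_trust(example_scores)
print(summary)
```

In practice the same summary would be computed separately for each model, so that high-trust proportions and mean scores can be compared across models.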
Results
Across 2,000 evaluations, GPT-5 and Claude 3.5 Opus achieved the highest mean trust scores (4.83 and 4.79, respectively), each exceeding 98% high-trust ratings. Claude 3.5 Sonnet performed moderately (mean 3.9, 74% high-trust), while Claude 3.5 Haiku produced less context-specific recommendations (mean 3.6, 66% high-trust). Inter-rater reliability was excellent, with ICC(2,1) = 0.91 (95% CI 0.88–0.94), confirming strong agreement among reviewers. Qualitative analysis showed that GPT-5 and Claude 3.5 Opus generated actionable and context-aware outputs, whereas smaller models often produced generic or incomplete guidance.
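For readers unfamiliar with the reliability statistic reported above, the following Python sketch shows one standard way to compute ICC(2,1) (two-way random effects, absolute agreement, single rater) from a complete items-by-raters matrix. It is an illustrative sketch, not the authors' code. The function name `icc_2_1` and the toy rating matrix are assumptions, and the sketch assumes every rater scored every item.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per rated item, each row holding
    one score per rater (a complete matrix with no missing values).
    """
    n = len(ratings)                      # number of rated items
    if n < 2:
        raise ValueError("need at least two rated items")
    k = len(ratings[0])                   # number of raters
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete matrix with at least two raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-items mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy matrix (six items, four raters) for illustration; not study data.
toy_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(toy_ratings)
print(round(icc, 2))  # roughly 0.29 for this toy matrix
```

The toy raters differ systematically in how high they score, which the absolute-agreement form penalizes; a value such as the 0.91 reported above would require raters to give nearly the same score to each item.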
Conclusions
CardiacGPT, powered by the newest LLMs (GPT-5 and Claude 3.5 series), demonstrated feasibility and exceptionally high clinician trust across 500 real-world surgical cases. This is the first blinded, multi-model evaluation of next-generation LLMs for cardiac surgery. While outcome-based prospective trials are still required, these results establish CardiacGPT as a promising real-time co-pilot for cardiac surgeons, with the potential to reduce cognitive load, standardize intraoperative communication, and improve postoperative planning.

