An LLM-assisted framework for accelerated and verifiable clinical hypothesis testing from electronic health records
Acquiring insights from electronic health records (EHRs) is slowed by manual analytical workflows that limit scalability and reproducibility. We present LATCH (LLM-Assisted Testing of Clinical Hypotheses), an agentic framework that converts natural language clinical hypotheses into fully auditable analyses on structured EHR data. LATCH integrates LLM-assisted semantic layers with deterministic execution pipelines to automate cohort construction, statistical analysis, and result reporting, while isolating patient-level data from LLM-involved steps. Using diabetes as a model disease, LATCH reproduced findings from 20 published studies within 3-15 minutes per study. Beyond replication, LATCH enabled study extensions and new insight generation through simple natural language hypothesis modifications. We demonstrated LATCH across 102 hypothesis tests spanning reproduction, extension, and insight generation. We systematically stress-tested LATCH to characterize its limitations and operational boundaries. LATCH provides a scalable framework for reproducible real-world evidence generation, reducing analytical bottlenecks and improving reliability of AI-assisted biomedical discovery while preserving human oversight.
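The separation described above, with an LLM-assisted layer that sees only the hypothesis text and a deterministic pipeline that alone touches patient-level rows, can be illustrated with a minimal sketch. This is not LATCH's implementation: the names `HypothesisSpec`, `stub_semantic_layer`, and `run_analysis`, the column names, the two-proportion z-test, and the synthetic records are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import erfc, sqrt


@dataclass(frozen=True)
class HypothesisSpec:
    """Structured analysis plan (hypothetical schema); holds no patient data."""
    exposure: str  # boolean column that splits the cohort into two groups
    outcome: str   # boolean outcome column
    min_age: int   # inclusion criterion for cohort construction


def stub_semantic_layer(hypothesis: str) -> HypothesisSpec:
    # Stand-in for an LLM call. It receives only the hypothesis text and
    # returns a structured spec; it never sees patient-level records.
    return HypothesisSpec(exposure="treated", outcome="event", min_age=40)


def run_analysis(spec: HypothesisSpec, records: list[dict]) -> dict:
    # Deterministic step: cohort construction, then a two-proportion z-test.
    cohort = [r for r in records if r["age"] >= spec.min_age]
    exposed = [r for r in cohort if r[spec.exposure]]
    unexposed = [r for r in cohort if not r[spec.exposure]]
    n1, n2 = len(exposed), len(unexposed)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    x1 = sum(1 for r in exposed if r[spec.outcome])
    x2 = sum(1 for r in unexposed if r[spec.outcome])
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se if se > 0 else 0.0
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    # Only aggregates and the spec are returned, which keeps the run auditable.
    return {"spec": spec, "n_exposed": n1, "n_unexposed": n2,
            "events_exposed": x1, "events_unexposed": x2,
            "z": z, "p_value": p_value}


# Synthetic toy records, invented purely for this example.
records = (
    [{"age": 55, "treated": True, "event": False}] * 8
    + [{"age": 60, "treated": True, "event": True}] * 2
    + [{"age": 58, "treated": False, "event": True}] * 6
    + [{"age": 62, "treated": False, "event": False}] * 4
    + [{"age": 30, "treated": False, "event": True}] * 5  # excluded by age
)

spec = stub_semantic_layer("Treated adults aged 40+ have fewer events.")
result = run_analysis(spec, records)
print(result)
```

Because the spec is an immutable, data-free object, it can be logged and reviewed by a human before any patient-level computation runs, and rerunning the same spec on the same data gives the same result.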

