Optimizing Clinical Trial Eligibility Design Using Natural Language Processing Models and Real-World Data: Algorithm Development and Validation
Background
Clinical trials are vital for developing new therapies, but they can also delay drug development. Efficient trial data management, optimized trial protocols, and accurate patient identification are critical for shortening trial timelines. Natural language processing (NLP) has the potential to support these objectives.
Objective
This study aims to assess the feasibility of using data-driven approaches to optimize clinical trial protocol design and identify eligible patients. To this end, we used deep learning–based NLP techniques to create a comprehensive eligibility criteria knowledge base integrated within electronic health records.
Methods
We obtained data on 3281 industry-sponsored phase 2 or 3 interventional clinical trials recruiting patients with non–small cell lung cancer, prostate cancer, breast cancer, multiple myeloma, ulcerative colitis, and Crohn disease from ClinicalTrials.gov, covering the period from 2013 to 2020. A customized NLP pipeline based on bidirectional long short-term memory and conditional random field models was used to extract all eligibility criteria attributes and convert hypernym concepts into computable hyponyms along with their corresponding values. To illustrate how clinical trial design can be simulated for optimization purposes, we selected a subset of patients with non–small cell lung cancer (n=2775), curated from the Mount Sinai Health System, as a pilot study.
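The hypernym-to-hyponym conversion described above can be sketched as a simple lookup. This is a minimal illustration, not the study's pipeline: the `Hyponym` type, the `expand` function, the attribute names, and every threshold in the table are hypothetical placeholders chosen only to show the shape of the transformation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hyponym:
    """One computable condition derived from a broad (hypernym) criterion."""
    attribute: str  # hypothetical EHR-queryable field name
    operator: str   # comparison operator as text
    value: float    # threshold value
    unit: str       # unit of the threshold


# Hypothetical mapping; the thresholds are illustrative, not taken from the study.
HYPERNYM_TO_HYPONYMS: Dict[str, List[Hyponym]] = {
    "adequate renal function": [
        Hyponym("creatinine_clearance", ">=", 60.0, "mL/min"),
    ],
    "adequate hepatic function": [
        Hyponym("alt", "<=", 2.5, "x ULN"),
        Hyponym("ast", "<=", 2.5, "x ULN"),
    ],
}


def expand(criterion: str) -> List[Hyponym]:
    """Return the computable hyponyms for a hypernym phrase, or [] if unknown."""
    key = " ".join(criterion.lower().split())  # normalize case and whitespace
    return HYPERNYM_TO_HYPONYMS.get(key, [])


for h in expand("Adequate Hepatic Function"):
    print(h.attribute, h.operator, h.value, h.unit)
```

In a real pipeline the phrases on the left would come from the entity extractor rather than from a hand-written dictionary, and the values on the right would come from the criteria text itself.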
Results
We manually annotated a subset of the clinical trial eligibility corpus (485/3281 trials, 14.78%) and constructed an eligibility criteria–specific ontology. Our customized NLP pipeline, built on this ontology, achieved high precision (0.91, range 0.67-1.00), recall (0.79, range 0.50-1.00), and F1-score (0.83, range 0.67-1.00), enabling the efficient extraction of granular criteria entities and relevant attributes from all 3281 clinical trials. A standardized eligibility criteria knowledge base, compatible with electronic health records, was developed by transforming hypernym concepts into machine-interpretable hyponyms along with their corresponding values. In addition, an interface prototype demonstrated the practicality of leveraging real-world data for optimizing clinical trial protocols and identifying eligible patients.
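For readers unfamiliar with the metrics, the following is a minimal sketch of how precision, recall, and F1 are commonly computed for entity extraction, assuming exact matching of (start, end, label) spans. The abstract does not state the matching rule or how scores were aggregated across entity types, so the `prf` function and the toy spans below are assumptions for illustration only.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start token, end token, entity label)


def prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over entity spans."""
    tp = len(gold & pred)  # spans predicted with the correct boundaries and label
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy annotation: four gold entities, three predictions, two exact matches.
gold = {(0, 2, "AGE"), (5, 7, "LAB"), (9, 10, "DX"), (12, 13, "DRUG")}
pred = {(0, 2, "AGE"), (5, 7, "LAB"), (9, 11, "DX")}

p, r, f = prf(gold, pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Note that a score averaged over entity types need not equal the harmonic mean of the averaged precision and recall.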
Conclusions
Our customized NLP pipeline successfully generated a standardized eligibility criteria knowledge base by transforming hypernym criteria into machine-readable hyponyms along with their corresponding values. A prototype interface integrating real-world patient information allows us to assess the impact of each eligibility criterion on the number of patients eligible for the trial. Combining NLP with real-world data holds promise for streamlining the overall clinical trial process and improving the efficiency of patient identification.
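One way to quantify the impact of each eligibility criterion is a leave-one-out comparison: count the patients who satisfy every criterion, then recount with one criterion removed. The sketch below assumes this approach; the abstract does not describe how the prototype computes impact, and the patient fields, criterion names, and thresholds here are hypothetical.

```python
from typing import Callable, Dict, List

Patient = Dict[str, float]
Criterion = Callable[[Patient], bool]


def count_eligible(patients: List[Patient], criteria: Dict[str, Criterion]) -> int:
    """Number of patients who satisfy every criterion."""
    return sum(all(check(p) for check in criteria.values()) for p in patients)


def criterion_impact(patients: List[Patient], criteria: Dict[str, Criterion]) -> Dict[str, int]:
    """Additional eligible patients gained by dropping each criterion in turn."""
    baseline = count_eligible(patients, criteria)
    impact = {}
    for name in criteria:
        relaxed = {k: v for k, v in criteria.items() if k != name}
        impact[name] = count_eligible(patients, relaxed) - baseline
    return impact


# Toy cohort with hypothetical fields (not real patient data).
patients = [
    {"age": 55, "ecog": 0, "crcl": 80},  # meets all criteria
    {"age": 17, "ecog": 1, "crcl": 90},  # fails age only
    {"age": 70, "ecog": 3, "crcl": 40},  # fails ECOG and renal function
    {"age": 62, "ecog": 1, "crcl": 50},  # fails renal function only
]
criteria = {
    "age_at_least_18": lambda p: p["age"] >= 18,
    "ecog_at_most_1": lambda p: p["ecog"] <= 1,
    "crcl_at_least_60": lambda p: p["crcl"] >= 60,
}

print(criterion_impact(patients, criteria))
```

A criterion with zero impact in this comparison may still exclude patients who fail for other reasons as well, so leave-one-out counts understate the effect of overlapping criteria.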
JMIR Publications Inc.

