Performance assessment of large language models in cancer staging: Comparative analysis of Mistral models
ABSTRACT
Cancer staging plays a critical role in treatment planning and prognosis but is often embedded in unstructured clinical narratives. To automate the extraction and structuring of staging data, large language models (LLMs) have emerged as a promising approach. However, their performance in real-world oncology settings has yet to be systematically evaluated. Herein, we analysed 1000 oncological summaries from patients receiving treatment for breast cancer between 2019 and 2020 at the François Baclesse Comprehensive Cancer Centre, France. Five Mistral artificial intelligence–based LLMs (i.e. Small, Medium, Large, Magistral and Mistral:latest) were evaluated for their ability to derive the cancer stage and identify staging elements. Larger models outperformed their smaller counterparts in staging accuracy and reproducibility (kappa > 0.95 for Mistral Large and Medium). Mistral Large achieved the highest accuracy in deriving the cancer stage (93.0%), surpassing the original clinical documentation in several cases. The LLMs consistently derived the cancer stage more accurately when they worked through the tumour size, nodal status and metastatic components than when they were asked for the stage directly. The top-performing models had a test–retest reliability exceeding 97%, while smaller models and locally deployed versions lacked sufficient robustness, particularly in handling unit conversions and complex staging rules. The structured, stepwise use of LLMs that emulates clinician reasoning offers a more efficient, transparent and reproducible approach to cancer staging, and the study findings support LLM integration into digital oncology workflows.
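The reproducibility figures quoted above (kappa and test–retest reliability) can be illustrated with a minimal Python sketch. It assumes that reproducibility is measured by comparing the stage labels a model returns on two runs over the same summaries, and that "kappa" means Cohen's kappa. The abstract does not state which kappa variant or protocol was used, so the function names and the toy labels here are hypothetical and are not the authors' method.

```python
from collections import Counter


def percent_agreement(run_a, run_b):
    """Fraction of cases where two runs returned the same stage label."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(run_a, run_b))
    return matches / len(run_a)


def cohen_kappa(run_a, run_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(run_a)
    p_observed = percent_agreement(run_a, run_b)
    counts_a = Counter(run_a)
    counts_b = Counter(run_b)
    # Chance agreement: probability that both runs pick the same label
    # if each run drew labels independently from its own distribution.
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        # Both runs used a single identical label; agreement is trivially perfect.
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Toy stage labels (made up for illustration, not study data).
first_run = ["I", "I", "II", "II"]
second_run = ["I", "I", "II", "I"]

print(percent_agreement(first_run, second_run))
print(cohen_kappa(first_run, second_run))
```

Raw percent agreement can look high simply because one stage dominates the cohort. Kappa discounts that chance agreement, which is one plausible reason for reporting both figures.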
Cold Spring Harbor Laboratory

