Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

Fine-Tuning for Accuracy: Evaluation of GPT for Automatic Assignment of ICD Codes to Clinical Documentation

View through CrossRef
Abstract Background: Assignment of International Classification of Disease (ICD) codes to clinical documentation is a tedious but important task that is mostly done manually. This study evaluated the widely popular OpenAI’s Generative Pretrained Model (GPT) 3.5 Turbo in facilitating the automation of assigning ICD codes to clinical notes. Methods: We identified the 10 most prevalent ICD-10 codes in the Medical Information Mart for Intensive Care (MIMIC-IV) dataset. We selected 200 notes for each code, and then split them equally into two groups of 100 each (randomly selected) for training and testing. We then passed each note to GPT 3.5 Turbo via OpenAI’s API, prompting the model to assign ICD-10 codes to each note. We evaluated the model’s response for the presence of the target ICD-10 code. After fine-tuning the GPT model on the training data, we repeated the process with the test data, comparing the fine-tuned model’s performance against the default model. Results: Initially the target ICD-10 code was present in the assigned codes by the default GPT 3.5 Turbo model in 29.7% of the cases. After fine-tuning with 100 notes for each top code, the accuracy improved to 62.6%. Conclusions: Historically, GPT’s performance for healthcare related tasks is sub-optimal. Fine-tuning as in this study provides great potential for improved performance, highlighting a path forward for integration of Artificial Intelligence (AI) in healthcare for improved efficiency and accuracy of this administrative task. Future research should focus on expanding the training datasets with specialized data and exploring the potential integration of these models into existing healthcare systems to maximize their utility and reliability.
Title: Fine-Tuning for Accuracy: Evaluation of GPT for Automatic Assignment of ICD Codes to Clinical Documentation
Description:
Abstract Background: Assignment of International Classification of Disease (ICD) codes to clinical documentation is a tedious but important task that is mostly done manually.
This study evaluated the widely popular OpenAI’s Generative Pretrained Model (GPT) 3.
5 Turbo in facilitating the automation of assigning ICD codes to clinical notes.
Methods: We identified the 10 most prevalent ICD-10 codes in the Medical Information Mart for Intensive Care (MIMIC-IV) dataset.
We selected 200 notes for each code, and then split them equally into two groups of 100 each (randomly selected) for training and testing.
We then passed each note to GPT 3.
5 Turbo via OpenAI’s API, prompting the model to assign ICD-10 codes to each note.
We evaluated the model’s response for the presence of the target ICD-10 code.
After fine-tuning the GPT model on the training data, we repeated the process with the test data, comparing the fine-tuned model’s performance against the default model.
Results: Initially the target ICD-10 code was present in the assigned codes by the default GPT 3.
5 Turbo model in 29.
7% of the cases.
After fine-tuning with 100 notes for each top code, the accuracy improved to 62.
6%.
Conclusions: Historically, GPT’s performance for healthcare related tasks is sub-optimal.
Fine-tuning as in this study provides great potential for improved performance, highlighting a path forward for integration of Artificial Intelligence (AI) in healthcare for improved efficiency and accuracy of this administrative task.
Future research should focus on expanding the training datasets with specialized data and exploring the potential integration of these models into existing healthcare systems to maximize their utility and reliability.

Related Results

The Effect of Clinical Knee Measurement in Children with Genu Varus
The Effect of Clinical Knee Measurement in Children with Genu Varus
Abstract Introduction Children with genu varus needs frequent assessment and follow up that may need several radiographies. This study investigates the effectiveness of the clinica...
Analisis Penggunaan GPT dalam Pembelajaran Klinik Optik I di ARO Gapopin
Analisis Penggunaan GPT dalam Pembelajaran Klinik Optik I di ARO Gapopin
Perkembangan teknologi kecerdasan buatan (Artificial Intelligence/AI), khususnya model bahasa besar seperti Generative Pre-trained Transformer (GPT), telah membawa transformasi bes...
Clinical outcomes of subcutaneous vs. transvenous implantable defibrillator therapy in a polymorbid patient cohort
Clinical outcomes of subcutaneous vs. transvenous implantable defibrillator therapy in a polymorbid patient cohort
BackgroundThe subcutaneous implantable cardioverter-defibrillator (S-ICD) has been designed to overcome lead-related complications and device endocarditis. Lacking the ability for ...

Back to Top