
Natural Language Processing for Surveillance of Cervical and Anal Cancer and Precancer: Algorithm Development and Split-Validation Study (Preprint)

BACKGROUND Accurate identification of new diagnoses of human papillomavirus–associated cancers and precancers is an important step toward the development of strategies that optimize the use of human papillomavirus vaccines. The diagnosis of human papillomavirus cancers hinges on a histopathologic report, which is typically stored in electronic medical records as free-form, or unstructured, narrative text. Previous efforts to perform surveillance for human papillomavirus cancers have relied on the manual review of pathology reports to extract diagnostic information, a process that is both labor- and resource-intensive. Natural language processing can be used to automate the structuring and extraction of clinical data from unstructured narrative text in medical records and may provide a practical and effective method for identifying patients with vaccine-preventable human papillomavirus disease for surveillance and research.

OBJECTIVE This study's objective was to develop and assess the accuracy of a natural language processing algorithm for the identification of individuals with cancer or precancer of the cervix and anus.

METHODS A pipeline-based natural language processing algorithm was developed, which incorporated machine learning and rule-based methods to extract diagnostic elements from the narrative pathology reports. To test the algorithm’s classification accuracy, we used a split-validation study design. Full-length cervical and anal pathology reports were randomly selected from 4 clinical pathology laboratories. Two study team members, blinded to the classifications produced by the natural language processing algorithm, manually and independently reviewed all reports and classified them at the document level according to 2 domains (diagnosis and human papillomavirus testing results). Using the manual review as the gold standard, the algorithm’s performance was evaluated using standard measurements of accuracy, recall, precision, and F-measure.

RESULTS The natural language processing algorithm’s performance was validated on 949 pathology reports. The algorithm demonstrated accurate identification of abnormal cytology, histology, and positive human papillomavirus tests with accuracies greater than 0.91. Precision was lowest for anal histology reports (0.87, 95% CI 0.59-0.98) and highest for cervical cytology (0.98, 95% CI 0.95-0.99). The natural language processing algorithm missed 2 out of the 15 abnormal anal histology reports, which led to a relatively low recall (0.68, 95% CI 0.43-0.87).

CONCLUSIONS This study outlines the development and validation of a freely available and easily implementable natural language processing algorithm that can automate the extraction and classification of clinical data from cervical and anal cytology and histology.
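
The evaluation measures named in the methods (accuracy, recall, precision, and F-measure) can be illustrated with a minimal sketch. This is not the study's code: the function name `evaluate`, the `Metrics` container, and the toy labels are hypothetical, and the sketch assumes each report has been reduced to a single binary document-level label (True for abnormal) from both the manual review and the algorithm.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 for an empty denominator instead of raising (illustrative choice).
    return numerator / denominator if denominator else 0.0


def evaluate(gold: list[bool], predicted: list[bool]) -> Metrics:
    """Score algorithm labels against manual-review labels (True = abnormal)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # abnormal, flagged
    fp = sum(1 for g, p in pairs if not g and p)      # normal, flagged
    fn = sum(1 for g, p in pairs if g and not p)      # abnormal, missed
    tn = sum(1 for g, p in pairs if not g and not p)  # normal, not flagged

    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    # F-measure here is the balanced F1: harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Metrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=precision,
        recall=recall,
        f_measure=f_measure,
    )


# Toy data (hypothetical): 10 reports, 4 abnormal by manual review.
gold_labels = [True, True, True, True, False, False, False, False, False, False]
algo_labels = [True, True, True, False, True, False, False, False, False, False]
example = evaluate(gold_labels, algo_labels)
```

With these toy labels the algorithm finds 3 of the 4 abnormal reports and raises 1 false alarm, so precision and recall are both 0.75 and accuracy is 0.8. The sketch does not compute the confidence intervals reported in the results, because the abstract does not say which interval method was used.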

Related Results

The Relationship Between Eating Behavior and the Incidence of Childhood Obesity (Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas)
Cervical cancer screening utilization and predictors among eligible women in Ethiopia: A systematic review and meta-analysis
Background: Despite remarkable progress in the reduction of the global rate of maternal mortality, cervical cancer has been identified as the leading cause of maternal morbidity and mo...
Are Cervical Ribs Indicators of Childhood Cancer? A Narrative Review
Abstract A cervical rib (CR), also known as a supernumerary or extra rib, is an additional rib that forms above the first rib, resulting from the overgrowth of the transverse proce...
Infracoccygeal/transperineal window: new method to prenatally diagnose and classify level of anal atresia
Objectives: To introduce a two‐dimensional sonographic method to assess the fetal anus, and to evaluate the feasibility of this method to diagnose anal atresia prenatally and...
Preliminary study on the pathogenesis of anal fistula
Background: Anal gland infection is one of the main pathogenic factors of anal fistula. The anal gland is mainly consis...
Cervical Cancer or Cervical Endometriosis – Review and Case Report
According to cancer death rates for women worldwide, this form of cancer ranks fourth after breast, bronchopulmonary, and colorectal cancer, affecting around 570,000 women annually...
C/EBPβ expression decreases in cervical cancer and leads to tumorigenesis
Background: Cervical cancer is currently estimated to be the fourth most common cancer among women worldwide and the leading cause of cancer...
