Development of a Custom Spell-Checker for Emergency Department Data
Objective
To share progress on a custom spell-checker for emergency department chief complaint free-text data and to demonstrate a Shiny application for spell-checker validation.

Introduction
Emergency department (ED) syndromic surveillance relies on the chief complaint, which is often a free-text field and may contain misspelled words, syntactic errors, and healthcare-specific or facility-specific abbreviations. Cleaning the chief complaint field may improve syndrome capture sensitivity and reduce misclassification of syndromes. We are building a spell-checker, customized with language found in ED corpora, as our first step in cleaning our chief complaint field. This exercise should clarify the value of pre-processing text and prepare the ground for future work using natural language processing (NLP) techniques, such as topic modeling. Such a tool could be extended to other datasets that contain free-text fields, including electronic reportable disease laboratory and case reporting.

Methods
A word in a chief complaint may be incorrect either because it is misspelled (e.g., “patient has herpertension”) or because it yields a syntactically incorrect phrase (e.g., the word “huts” in the phrase “my toe huts”). We are developing a spell-checker tool for chief complaint text using the R and Python programming languages. The first stage of development is identifying and handling misspellings; future work will address syntactic errors. Known abbreviations are identified using regular expressions, and unknown abbreviations are addressed by the spell-checker.
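As a rough illustration of the regular-expression handling of known abbreviations mentioned above, the following Python sketch expands a small abbreviation table. The table entries, the name `KNOWN_ABBREVIATIONS`, and the function `expand_abbreviations` are assumptions made for this example; the abstract does not specify the abbreviation list or the patterns actually used.

```python
import re

# Hypothetical abbreviation table for illustration only; a real table would
# come from ED-specific and facility-specific terminology.
KNOWN_ABBREVIATIONS = {
    "sob": "shortness of breath",
    "abd": "abdominal",
    "n/v": "nausea and vomiting",
}

# Try longer abbreviations first, and require that a match is not embedded
# inside a longer word (so "sob" does not match inside "sobbing").
_ABBREVIATION_PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(
        re.escape(key)
        for key in sorted(KNOWN_ABBREVIATIONS, key=len, reverse=True)
    )
    + r")(?!\w)",
    flags=re.IGNORECASE,
)


def expand_abbreviations(text: str) -> str:
    """Replace each known abbreviation in the text with its expansion."""
    return _ABBREVIATION_PATTERN.sub(
        lambda match: KNOWN_ABBREVIATIONS[match.group(1).lower()], text
    )
```

For example, `expand_abbreviations("SOB and abd pain")` yields `"shortness of breath and abdominal pain"`, while a word such as "sobbing" is left unchanged. Tokens that are not in the table pass through untouched and would be left for the spell-checker to handle.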
The spell-checker performs four steps on chief complaint data, based on methods found in the literature [1]: identification of misspellings, generation of a list of candidate substitute words, word sense disambiguation to identify the replacement word, and replacement of the misspelled word. Because the spell-checker requires a dictionary of correctly spelled, healthcare-specific terms, including all terms that would appear in an ED corpus, we used vocabularies from the Unified Medical Language System (UMLS), ED-specific terminology, and domain expert input. Dictionary construction, misspelling identification algorithms, and word list generation algorithms are in the development stage.

Simultaneously, we are building an interactive R Shiny web application in which syndromic surveillance analysts manually correct a subset of misspelled words; we will use these corrections to validate and evaluate the performance of the spell-checker tool.

[1] Tolentino HD, Matters MD, Walop W, et al. A UMLS-based spell checker for natural language processing in vaccine safety. BMC Medical Informatics and Decision Making. 2007;7(1). doi:10.1186/1472-6947-7-3.

Results
The project is still in the development phase.

Conclusions
The audience will learn about important considerations for developing a spell-checker, including the data structure of the dictionary and the algorithms for identifying misspelled words and candidate replacement words. We will demonstrate our word list generation algorithm and the Shiny application that uses these words for spell-checker validation. We will share relevant code; after our presentation, audience members should be able to apply the code and lessons to their own projects and/or collaborate with the NYC Department of Health and Mental Hygiene.
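The first two of the four steps described in Methods (identifying misspellings and generating a candidate word list) could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' algorithm: the toy `DICTIONARY`, the use of Levenshtein edit distance, the threshold of 2, and all function names are choices made for this example only.

```python
import re

# Hypothetical toy dictionary. The abstract describes a dictionary built from
# UMLS vocabularies, ED-specific terminology, and domain expert input.
DICTIONARY = {
    "patient", "has", "hypertension", "hypotension",
    "my", "toe", "hurts", "huts",
}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def find_misspellings(text: str) -> list[str]:
    """Step 1 (assumed form): flag tokens that are absent from the dictionary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in DICTIONARY]


def candidate_words(word: str, max_distance: int = 2) -> list[str]:
    """Step 2 (assumed form): dictionary words within max_distance edits,
    ordered by distance and then alphabetically."""
    scored = sorted((edit_distance(word, entry), entry) for entry in DICTIONARY)
    return [entry for distance, entry in scored if distance <= max_distance]
```

With this toy dictionary, `find_misspellings("patient has herpertension")` flags `"herpertension"`, and `candidate_words("herpertension")` returns `["hypertension"]`. By contrast, `find_misspellings("my toe huts")` returns an empty list because "huts" is a valid word, which illustrates why errors of that kind need the separate syntactic handling that the abstract leaves to future work.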
University of Illinois Libraries

