Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records
In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing to narrative clinical notes in electronic medical records (EMRs), and of efficiently retrieving the annotations that satisfy given position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which the optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis of the basic interval tree query algorithm and show that it is not optimal when applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms and propose query reformulations and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic stabbing-max query time and solves the stabbing-interval query tasks on all of Allen's relations in logarithmic time, attaining the theoretical lower bound. Update time remains logarithmic and the space requirement remains linear. We also discuss interval management in external memory models and higher dimensions.
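To make the setting concrete, the following is a minimal sketch of stand-off annotations and a stabbing query (find every annotation covering a given character position). It is not the algorithm proposed in the paper: it is the textbook max-end augmented search tree, built once over a fixed set of annotations, and every name in it (`Annotation`, `Node`, `build`, `stab`, the sample labels and sentence) is an assumption made for illustration. Offsets are assumed to be half-open, `[start, end)`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-off annotation: offsets point into the text,
    the label is stored separately from the text itself."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


@dataclass
class Node:
    ann: Annotation
    max_end: int  # largest `end` found anywhere in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(anns: List[Annotation]) -> Optional[Node]:
    """Build a balanced search tree keyed on (start, end)."""
    ordered = sorted(anns, key=lambda a: (a.start, a.end))

    def rec(lo: int, hi: int) -> Optional[Node]:
        if lo >= hi:
            return None
        mid = (lo + hi) // 2
        node = Node(ordered[mid], ordered[mid].end, rec(lo, mid), rec(mid + 1, hi))
        for child in (node.left, node.right):
            if child is not None:
                node.max_end = max(node.max_end, child.max_end)
        return node

    return rec(0, len(ordered))


def stab(node: Optional[Node], pos: int, out: List[Annotation]) -> List[Annotation]:
    """Collect annotations with start <= pos < end, in order of start offset."""
    if node is None or node.max_end <= pos:
        return out  # no interval in this subtree extends past pos
    stab(node.left, pos, out)
    if node.ann.start <= pos:
        if pos < node.ann.end:
            out.append(node.ann)
        # Starts in the right subtree are >= this start, so they may still qualify.
        stab(node.right, pos, out)
    return out


text = "Patient denies chest pain."
annotations = [
    Annotation(0, 26, "Sentence"),
    Annotation(8, 14, "Negation"),
    Annotation(15, 25, "Symptom"),
    Annotation(15, 20, "Anatomy"),
]
tree = build(annotations)
hits = stab(tree, 17, [])  # position 17 falls inside "chest"
labels = [a.label for a in hits]
covered = [text[a.start:a.end] for a in hits]
```

Here `labels` holds the annotations that cover position 17, and `covered` recovers the annotated spans by slicing the untouched text, which is the point of keeping annotations stand-off. The `max_end` field only prunes subtrees that cannot contain a hit; the sketch makes no claim to the logarithmic bounds the abstract reports for the authors' augmented structure, and it does not support updates.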

