A Chinese telemedicine-dialogue dataset annotated for named entities
Abstract
Background: A large collection of patient-doctor dialogues annotated for medical named entities is needed to build intelligent telemedicine systems. However, most patients in telemedicine describe these entities informally, often as sentence-level multi-word expressions, which makes them difficult to tag in telemedicine dialogue data. This study aims to address this issue.
Methods: Using telemedicine dialogues from Haodf, we developed annotation guidelines and followed a two-round procedure to tag six types of named entities: disease, symptom, time, pharmaceutical, operation, and examination. We also ran four deep-learning models on the dataset to establish a benchmark for named entity recognition.
Results: The distilled dataset contains 2,383 consultations between doctors and patients, with 13,411 sentences from doctors and 17,929 from patients. The average number of characters per consultation is 1,100. There are 63,560 named entities in total, and the average number of characters per named entity is 4.33. The experimental results suggest that LatticeLSTM performs best on our dataset across all reported metrics, such as accuracy, precision, and F1.
Conclusion: Compared with other existing datasets, this dataset is novel in three respects. First, it tackles the intricate tagging of long multi-word expressions for medical named entities. Second, it is one of the first attempts to mark temporal entities. Third, it is balanced across the six label types. We believe that this dataset will play a considerable role in expanding telemedicine AI.
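As an illustration of what character-level annotation over the six entity types could look like, the following is a minimal sketch. It assumes a BIO tagging scheme and a simple `(start, end, type)` span format; the abstract does not state the dataset's actual encoding, and the function names `to_bio` and `mean_entity_length` as well as the example sentence are hypothetical, not taken from the dataset.

```python
from typing import List, Tuple

# The six entity types named in the abstract. Using these exact strings as
# label names is an assumption made for this sketch.
ENTITY_TYPES = {
    "disease", "symptom", "time",
    "pharmaceutical", "operation", "examination",
}

# A span is (start, end_exclusive, entity_type), indexed by character.
Span = Tuple[int, int, str]


def to_bio(text: str, spans: List[Span]) -> List[str]:
    """Convert character-level entity spans to one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        if etype not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {etype}")
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end)}")
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


def mean_entity_length(spans: List[Span]) -> float:
    """Average number of characters per entity (0.0 if there are none)."""
    if not spans:
        return 0.0
    return sum(end - start for start, end, _ in spans) / len(spans)


# Invented example: "I have had a headache for three days."
text = "我头疼三天了"
spans = [(1, 3, "symptom"), (3, 5, "time")]

for char, tag in zip(text, to_bio(text, spans)):
    print(char, tag)
print(mean_entity_length(spans))
```

Applied across a whole corpus, `mean_entity_length` corresponds to the kind of statistic reported above as average characters per named entity.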

