
A Chinese telemedicine-dialogue dataset annotated for named entities

Abstract
Background: A large collection of dialogues between patients and doctors needs to be annotated for medical named entities in order to build intelligent telemedicine systems. However, because most patients in telemedicine describe these entities informally, often as sentence-level multi-word expressions, tagging them in telemedicine dialogue data is challenging. This study aims to address that problem.
Methods: Using telemedicine dialogues from Haodf, we developed annotation guidelines and followed a two-round procedure to tag six types of named entities: disease, symptom, time, pharmaceutical, operation, and examination. We also evaluated four deep-learning models on the dataset to establish a benchmark for named entity recognition.
Results: The distilled dataset contains 2,383 consultations between doctors and patients, with 13,411 sentences from doctors and 17,929 from patients. The average consultation is 1,100 characters long. There are 63,560 named entities in total, and the average named entity is 4.33 characters long. The experimental results suggest that LatticeLSTM performs best on our dataset on all reported scores, including accuracy, precision, and F1.
Conclusion: Compared with other existing datasets, this dataset is novel in three respects. First, it tackles the intricate tagging of long multi-word expressions for medical named entities. Second, it is one of the first attempts to mark temporal entities. Third, it is balanced across the six label types. We believe that this dataset will play a considerable role in expanding telemedicine AI.
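To make the benchmark metrics concrete, the following is a minimal sketch of how entity-level precision, recall, and F1 might be computed over character-level tags for the six entity types. The abstract does not state the tagging scheme or the matching criterion, so the BIO scheme, exact-span matching, the tag names, and the example sentence are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Set, Tuple

# The six entity types come from the abstract; using them as BIO tag
# suffixes (e.g. "B-symptom") is an assumption of this sketch.
ENTITY_TYPES = {"disease", "symptom", "time",
                "pharmaceutical", "operation", "examination"}

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert character-level BIO tags into a set of entity spans."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def span_prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over entity spans."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical patient utterance: "头痛三天" ("headache for three days").
gold_tags = ["B-symptom", "I-symptom", "B-time", "I-time"]
pred_tags = ["B-symptom", "I-symptom", "O", "B-time"]  # time span is too short

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)
p, r, f1 = span_prf(gold_spans, pred_spans)

# Mean entity length in characters, analogous to the 4.33 figure reported above.
mean_len = sum(end - start for start, end, _ in gold_spans) / len(gold_spans)
print(p, r, f1, mean_len)
```

Exact-span matching is strict: a predicted entity that covers only part of a long multi-word expression counts as both a false positive and a false negative, which is one reason long patient-described entities are hard to score well on.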

Related Results

Perceptions of Telemedicine and Rural Healthcare Access in a Developing Country: A Case Study of Bayelsa State, Nigeria
Abstract Introduction Telemedicine is the remote delivery of healthcare services using information and communication technologies and has gained global recognition as a solution to...
Telemedicine Patient Satisfaction Dimensions Moderated by Patient Demographics
Background: A multi-dimensional telemedicine patient satisfaction measure is utilized to provide managerial insights into where service improvements are needed and factors that imp...
Telemedicine and telehealth
Regular hospital visits can be expensive due to travel costs, especially in rural areas. Fortunately, when telemedicine services are used through video conferencing or other virtua...
O005. Telemedicine in Juvenile Idiopathic Arthritis in the era of COVID-19
Abstract Background With the COVID-19 pandemic, health care systems are facing challenges in delivering proper patient care. Chi...
Acceptance and adoption determinants of telemedicine in public healthcare institutions
Background: One of the challenges facing the usage of telemedicine technology in South Africa, particularly in the North West province (NWP), is lack of user acceptance by health c...
Key data elements for a successful pediatric rheumatology virtual visit: a survey within the PR-COIN network
Introduction: Juvenile idiopathic arthritis (JIA) is the most common childhood rheumatic disease which is commonly monitored by a combination of history, physical examination, bloodw...
