
Using Natural Language Processing to Automatically Assess Feedback Quality: Findings From 3 Surgical Residencies

Purpose: Learning is markedly improved with high-quality feedback, yet assuring the quality of feedback is difficult to achieve at scale. Natural language processing (NLP) algorithms may be useful in this context as they can automatically classify large volumes of narrative data. However, it is unknown if NLP models can accurately evaluate surgical trainee feedback. This study evaluated which NLP techniques best classify the quality of surgical trainee formative feedback recorded as part of a workplace assessment.

Method: During the 2016–2017 academic year, the SIMPL (Society for Improving Medical Professional Learning) app was used to record operative performance narrative feedback for residents at 3 university-based general surgery residency training programs. Feedback comments were collected for a sample of residents representing all 5 postgraduate year levels and coded for quality. In May 2019, the coded comments were then used to train NLP models to automatically classify the quality of feedback across 4 categories (effective, mediocre, ineffective, or other). Models included support vector machines (SVM), logistic regression, gradient boosted trees, naive Bayes, and random forests. The primary outcome was mean classification accuracy.

Results: The authors manually coded the quality of 600 recorded feedback comments. Those data were used to train NLP models to automatically classify the quality of feedback across 4 categories. The NLP model using an SVM algorithm yielded a maximum mean accuracy of 0.64 (standard deviation, 0.01). When the classification task was modified to distinguish only high-quality vs low-quality feedback, maximum mean accuracy was 0.83, again with SVM.

Conclusions: To the authors’ knowledge, this is the first study to examine the use of NLP for classifying feedback quality. SVM NLP models demonstrated the ability to automatically classify the quality of surgical trainee evaluations. Larger training datasets would likely further increase accuracy.
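The abstract names its outcome (mean classification accuracy with a standard deviation) but not the pipeline behind it. The sketch below is one way such a number could be produced, under stated assumptions: it uses naive Bayes (one of the model families listed) on whitespace-tokenized words, scored with 3-fold cross-validation. The tokenization, the fold count, the two-label setup, and every comment in `TOY_DATA` are invented for illustration and are not taken from the study.

```python
import math
import random
import statistics
from collections import Counter

# Hypothetical toy comments, invented for illustration. NOT study data.
TOY_DATA = [
    ("keep more tension on the suture and slow down near the anastomosis", "effective"),
    ("next time identify the cystic duct before you clip", "effective"),
    ("work on port placement and plan each step before you cut", "effective"),
    ("improve camera handling and keep the horizon level next time", "effective"),
    ("practice knot tying and keep tension on the mesh", "effective"),
    ("plan your dissection plane before you start next time", "effective"),
    ("good job", "ineffective"),
    ("great case", "ineffective"),
    ("nice work today", "ineffective"),
    ("well done", "ineffective"),
    ("good work", "ineffective"),
    ("fine", "ineffective"),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.class_counts[label] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_val_accuracy(data, k=3, seed=0):
    """Return (mean, population standard deviation) of accuracy over k folds."""
    rows = list(data)
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
        correct = sum(model.predict(text) == label for text, label in test)
        scores.append(correct / len(test))
    return statistics.mean(scores), statistics.pstdev(scores)


mean_acc, sd_acc = cross_val_accuracy(TOY_DATA)
print(f"mean accuracy {mean_acc:.2f} (standard deviation {sd_acc:.2f})")
```

With only a dozen made-up comments the printed figures carry no meaning; the point is the shape of the evaluation. Swapping in another classifier from the abstract's list would leave `cross_val_accuracy` unchanged as long as the model exposes the same `fit` and `predict` methods.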

Oxford University Press (OUP)

Erkin Ötleş, Daniel E. Kendrick, Quintin P. Solano, Mary Schuller, Samantha L. Ahle, Mickyas H. Eskender, Emily Carnes, Brian C. George

Academic Medicine

2021



