Who is Tweeting? A Scoping Review of Methods to Establish Race and Ethnicity from Twitter Datasets
Background: A growing amount of health research uses social media data. Those critical of social media research often cite that it may be unrepresentative of the population. Identifying the demographics of social media users enables us to measure the representativeness. Extracting race or ethnicity from social media data can be difficult and researchers may choose from a multitude of different approaches.

Methods: We present a scoping review to identify the methods used to extract race or ethnicity from Twitter datasets. We searched 16 electronic databases and carried out reference checking in order to identify relevant articles. Sifting of each record was undertaken independently by at least two researchers with any disagreement discussed. The research could be grouped by the methods applied to extract race or ethnicity.

Results: From 1093 records we identified 56 that met our inclusion criteria. The majority focus on Twitter users based in the US. A range of types of data were used including Twitter profile pictures, bios, and/or location, and the content in the tweets themselves. The methods used were wide ranging and included using manual inference, linkage to census data, commercial software, language/dialect recognition and machine learning. Not all studies evaluated their methods. Those that did found accuracy to vary from 45% to 93% with significantly lower accuracy identifying non-white race categories. There may be some ethical questions over some of the methods used, particularly using photos or dialect, as well as questions surrounding accuracy.

Conclusion: There is no standard approach or guidelines for extracting race or ethnicity from Twitter or other social media. Social media researchers must use careful interpretation of race or ethnicity and not over-promise what can be achieved, as even manual screening is a subjective, imperfect method. Future research should establish the accuracy of methods to inform evidence-based best practice guidelines for social media researchers, and be guided by concerns of equity and social justice.
Center for Open Science
Related Results
Faith Tweets: Ambient Religious Communication and Microblogging Rituals
There’s no reason to think that Jesus wouldn’t have Facebooked or twittered if he came into the world now. Can you imagine his killer status updates? Reverend Schenck, New York, Al...
Alts and Automediality: Compartmentalising the Self through Multiple Social Media Profiles
Introduction: Alt, or alternative, accounts are secondary profiles people use in addition to a main account on a social media platform. They are a kind of automediation, a way of rep...
Methods to Establish Race or Ethnicity of Twitter Users: Scoping Review (Preprint)
BACKGROUND
A growing amount of health research uses social media data. Those critical of social media research often cite that it may be unrepresentative of...
A Twitter Sentimen Analysis on Islamic Banking Using Drone Emprit Academic (DEA): Evidence from Indonesia
ABSTRACT
The research aimed to identify and collect issues discussed regarding Islamic banking from user activity, sentimen, and content on Twitter. This study used a qualitative a...
Mindy Calling: Size, Beauty, Race in The Mindy Project
When characters in the Fox Television sitcom The Mindy Project call Mindy Lahiri fat, Mindy sees it as a case of misidentification. She reminds the character that she is a “petite ...
Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Abstract
The Physical Activity Guidelines for Americans (Guidelines) advises older adults to be as active as possible. Yet, despite the well documented benefits of physical a...
SOCIAL ANXIETY DAN ONLINE SELF-DISCLOSURE PADA MAHASISWA PENGGUNA TWITTER/X
Abstract - University students are in the emerging adulthood stage of life, which carries the developmental task of forming relationships. In their efforts to form relationships, students use...
Komunikasi Verbal Body Shaming di Media Sosial Twitter terhadap Kepercayaan Diri Remaja
Bullying of body shaming is increasingly prevalent on Twitter. With the existence of social media, bullying often occurs in cyberspace. Verbal communication is...

