Abstract 820: Decolonizing data: Diversifying cancer registries to include SWANA
Abstract
Southwest Asian/North African (SWANA) communities make up over 3-4% of immigrants in the U.S., yet their health status is largely unknown because these ethnic groups are misclassified within the U.S. racial schema as White, rendering them ‘invisible minorities’. Administrative forms specify that White includes Middle Eastern, but SWANA persons may also self-identify as Black, Asian, or Other. With the rise of Islamophobia and increased U.S. intervention in the Middle East, SWANA Americans face unique challenges that require a deeper understanding of their health status.
One way to obtain cancer statistics on SWANA populations is to use naming algorithms. Like SWANA communities, the Latine population was invisible in administrative data prior to the 1970s. Grassroots efforts and advocacy from the Latine community led to the development of validated Latine surname algorithms, which have been implemented by the National Cancer Institute. Similarly, SWANA activists have advocated for the creation of a federal identification category for over 50 years, arguing that SWANA communities are not perceived as White due, in large part, to a long-standing history of political racism in the United States.
The purpose of this study was to develop a SWANA Surname Algorithm (SSA) to inclusively identify SWANA individuals in cancer health data. We used surnames by country of descent to train interpretable decision trees that distinguish SWANA from non-SWANA individuals by iteratively selecting the surname roots at which splitting the data best separates the two groups. We integrated these patterns into our SSA so that, when presented with a new surname, the algorithm follows the decision path down to a leaf node, which gives the predicted class (SWANA vs non-SWANA).
We developed a preliminary SWANA Surname List (SSL) using publicly available naming databases by country of origin (N=71,300).
We cross-referenced the SSL against the VCU Massey Cancer Center data repository and found that 4.9% of all cancer patients from 2016 to 2020 matched as SWANA. Notably, the prevalence of SWANA patients has increased over the last few decades, from 3.8% in 1991-1995 to 4.2% in 2001-2005 and, most recently, 4.9% in 2016-2020. We will use our SSA to validate these findings. These preliminary findings underscore the valuable insights that naming algorithms can provide in elucidating the true demographic composition of cancer patients. Lack of racial/ethnic disaggregation perpetuates existing inequities in access to essential health resources among SWANA communities. Including SWANA in cancer disparities research would allow researchers to better examine the cancer health status of this underrepresented but growing community while also aligning with the true racialization of SWANA in the United States.
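The abstract does not give implementation details for the SSA, so the following is only a minimal sketch of the general idea it describes: greedily choosing the surname root whose presence or absence best separates the two classes, then following those splits to a leaf for a new surname. Everything specific here is an assumption made for illustration: the toy surnames and labels, the candidate roots, the use of plain substring matching as the "contains root" test, Gini impurity as the split criterion, and the function names `gini`, `best_root`, `build_tree`, and `predict`. None of it comes from the authors' SSL or SSA.

```python
from collections import Counter

# Hypothetical toy data for illustration only; not taken from the authors' SSL.
TOY_ROWS = [
    ("alhaddad", "SWANA"), ("alnasser", "SWANA"),
    ("abdullah", "SWANA"), ("abdelrahman", "SWANA"),
    ("smith", "non-SWANA"), ("johnson", "non-SWANA"),
    ("garcia", "non-SWANA"), ("brown", "non-SWANA"),
]
TOY_ROOTS = ["al", "abd", "son"]  # hypothetical candidate surname roots


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_root(rows, roots):
    """Return the root whose contains/does-not-contain split has the lowest
    weighted Gini impurity, or None if no root splits the rows."""
    best, best_score = None, None
    for root in roots:
        with_root = [lab for name, lab in rows if root in name]
        without = [lab for name, lab in rows if root not in name]
        if not with_root or not without:
            continue  # this root does not divide the data
        score = (len(with_root) * gini(with_root)
                 + len(without) * gini(without)) / len(rows)
        if best_score is None or score < best_score:
            best, best_score = root, score
    return best


def build_tree(rows, roots, depth=0, max_depth=3):
    """Recursively build a tree: a leaf is a label, a node is a dict."""
    labels = [lab for _, lab in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    root = best_root(rows, roots)
    if root is None:
        return majority
    remaining = [r for r in roots if r != root]
    yes = [row for row in rows if root in row[0]]
    no = [row for row in rows if root not in row[0]]
    return {
        "root": root,
        "yes": build_tree(yes, remaining, depth + 1, max_depth),
        "no": build_tree(no, remaining, depth + 1, max_depth),
    }


def predict(tree, surname):
    """Follow the splits down to a leaf and return its class label."""
    name = surname.lower()
    while isinstance(tree, dict):
        tree = tree["yes"] if tree["root"] in name else tree["no"]
    return tree


tree = build_tree(TOY_ROWS, TOY_ROOTS)
print(predict(tree, "Abdulaziz"), predict(tree, "Miller"))
```

On this toy data the tree labels "Abdulaziz" as SWANA and "Miller" as non-SWANA. It would also label "Walker" as SWANA, simply because the name contains the substring "al", which illustrates why a surname algorithm of this kind needs validation against real data before its counts are interpreted.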
Citation Format: Guleer Shahab, Michael Preston. Decolonizing data: Diversifying cancer registries to include SWANA [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 820.

