
A novel approach to learning through categorical variables applicable to the classification of solitary pulmonary nodule malignancy

Abstract

Background: A main difficulty in constructing a classification model arises when some or all of the covariates are categorical variables. Classical methods either assign labels to each level of a categorical variable or rely on summary measures (frequencies and percentages), which can be interpreted as probabilities.

Methods: We adopted a novel mathematical procedure to construct a classification model from categorical variables based on a non-classical probability approach. More specifically, we codified the variables following the categorical data representation from Discriminant Correspondence Analysis before constructing a non-classical probability matrix system that represents an entangled system of dependent-independent variables. We then developed a disentangling procedure to obtain an empirical density function for each representative class (minimum of two classes). Finally, we constructed our classification model using the density functions.

Results: We applied the proposed procedure to build a classification model of the malignancy of Solitary Pulmonary Nodule (SPN) after five years of follow-up using routine clinical data. First, we constructed the classification model with 2/3 (270) of the sample of 404 patients with SPN, and then validated it with the remaining 1/3 (134). We tested the procedure’s stability by repeating the analysis randomly 1000 times. We obtained a model accuracy of 0.74, an F1 score of 0.58, a Cohen’s Kappa value of 0.41 and a Matthews Correlation Coefficient of 0.45. Finally, the area under the ROC curve was 0.86.

Conclusion: The proposed procedure yields a machine learning classification model of solitary pulmonary nodule malignancy, constructed from routine clinical data composed mainly of categorical variables. Its performance is acceptable, and clinicians could use it as a tool to classify SPN malignancy in routine clinical practice.
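The evaluation protocol in the Results (a 2/3 to 1/3 split of 404 patients, repeated 1000 times, summarised by accuracy, F1, Cohen's Kappa and the Matthews Correlation Coefficient) can be illustrated with a minimal sketch. This is not the authors' code. The function names `binary_metrics` and `random_split`, the fixed seed, and the confusion-matrix counts are hypothetical choices made for illustration, and the sketch assumes that "repeating the analysis randomly" means redrawing the training/validation split each time.

```python
import math
import random


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, F1, Cohen's kappa and MCC from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    # Matthews correlation coefficient: correlation between predicted and true labels.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "kappa": kappa, "mcc": mcc}


def random_split(n_samples, rng):
    """Shuffle sample indices and split them 2/3 training, 1/3 validation."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = math.ceil(2 * n_samples / 3)
    return indices[:n_train], indices[n_train:]


rng = random.Random(0)  # fixed seed: an assumption, for reproducibility only

# 1000 independent splits of 404 patients (assumed reading of the stability check).
splits = [random_split(404, rng) for _ in range(1000)]
train_idx, valid_idx = splits[0]
print(len(train_idx), len(valid_idx))  # 270 134

# Hypothetical counts for one 134-patient validation set; NOT data from the study.
print(binary_metrics(tp=30, fp=15, fn=20, tn=69))
```

In a full stability analysis, a model would be fitted on each training subset and `binary_metrics` applied to each validation subset, with the metrics then summarised over the 1000 repetitions.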

Related Results

Clinicopathological Features of Indeterminate Thyroid Nodules: A Single-center Cross-sectional Study
Abstract Introduction Due to indeterminate cytology, Bethesda III is the most controversial category within the Bethesda System for Reporting Thyroid Cytopathology. This study exam...
Primary Thyroid Non-Hodgkin B-Cell Lymphoma: A Case Series
Abstract Introduction Non-Hodgkin lymphoma (NHL) of the thyroid, a rare malignancy linked to autoimmune disorders, is poorly understood in terms of its pathogenesis and treatment o...
Clinico Pathological Study of Solitary Thyroid Nodule - A Prospective Study of 50 Cases.
Background: A common presentation of thyroid disorder is a solitary nodule. A discrete swelling in an impalpable gland is termed a solitary nodule of the thyroid. The majority of solita...
Influence of seabed heterogeneity on benthic megafaunal community patterns in abyssal nodule fields
Polymetallic nodule fields, at 3000–6000 m depth, harbour some of the most diverse seabed communities in the abyss. In these habitats, nodules are keystone structures for many sess...
Multimodality imaging of chronic thromboembolic pulmonary hypertension : new insights into old challenges
Background: Most forms of pulmonary hypertension carry unsatisfactory prognosis with the notable exception of chronic throm...
Solitary Pulmonary Nodule Malignancy Predictive Models Applicable to Routine Clinical Practice: A Systematic Review
Abstract Background: Solitary Pulmonary Nodule (SPN) is a common finding in routine clinical practice when performing chest imaging tests. The vast majority of these nodule...
