Optimizing Random Forests: Spark Implementations of Random Genetic Forests

The Random Forest (RF) algorithm, originally proposed by Breiman [7], is a widely used machine learning algorithm valued for its fast learning speed and high classification accuracy. Despite this widespread use, the mechanisms at work in Breiman’s RF are not yet fully understood, and several aspects of optimizing the algorithm remain under active research, especially in big data environments. This work builds new ensembles that use genetic algorithms to optimize the random portions of the RF algorithm, yielding Random Genetic Forests (RGF), Negatively Correlated RGF (NC-RGF), and Preemptive RGF (PFS-RGF). These ensembles are compared with Breiman’s classic RF algorithm in Hadoop’s big data framework, using Spark, on a large, high-dimensional network intrusion dataset, UNSW-NB15.
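
To make the idea of replacing a random choice with a genetic search concrete, the following is a minimal, single-machine sketch in plain Python. It is not the authors' Spark implementation. It assumes, purely for illustration, that the "random portion" being optimized is the subset of features given to a tree, encoded as a 0/1 mask. The function `evolve_feature_mask`, the toy fitness function, and the `INFORMATIVE` set are all hypothetical names introduced here; in a real forest the fitness would more plausibly be a tree's validation accuracy.

```python
import random


def evolve_feature_mask(n_features, fitness, pop_size=20, generations=30,
                        mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that maximizes fitness(mask).

    Hypothetical helper for illustration only; the abstract does not
    specify the encoding or the genetic operators used in RGF.
    """
    rng = random.Random(seed)
    # Random initial population of bitmasks (1 = feature is kept).
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Elitism: carry the two best masks over unchanged.
        next_pop = [ranked[0][:], ranked[1][:]]
        # Truncation selection: only the top half may become parents.
        parents = ranked[:pop_size // 2]
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        population = next_pop
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy fitness (an assumption): features 0, 3 and 7 are "informative";
# every other selected feature costs a small penalty.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.25 * extras


best_mask, history = evolve_feature_mask(10, toy_fitness)
print(best_mask, history[-1])
```

Because the two best masks are copied unchanged into each new generation, the best fitness recorded in `history` can never decrease. A distributed version would evaluate each candidate's fitness in parallel, which is where a framework such as Spark would come in.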
