
Phish: A Novel Hyper-Optimizable Activation Function

Deep-learning models learn to estimate values using backpropagation. The activation function in the hidden layers is critical to minimizing loss in deep neural networks. The Rectified Linear Unit (ReLU) has been the dominant activation function for the past decade. Swish and Mish are newer activation functions that have been shown to yield better results than ReLU under specific circumstances. Phish is a novel non-monotonic activation function proposed here. It is a composite function defined as f(x) = x·tanh(GELU(x)), and its derivative shows no apparent discontinuities on the domain observed. Four generalized networks were constructed using different activation functions, with softmax as the output function. Using images from the MNIST and CIFAR-10 datasets, these networks were trained to minimize sparse categorical cross-entropy. A large-scale cross-validation was simulated using stochastic Markov chains to account for the law of large numbers for the probability values. Statistical tests support the research hypothesis that Phish could outperform other activation functions in classification. Future experiments would involve testing Phish in unsupervised learning algorithms and comparing it with more activation functions.
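As a rough illustration of the definition f(x) = x·tanh(GELU(x)), the following minimal sketch evaluates Phish on scalars using only the Python standard library. It assumes the exact (erf-based) form of GELU, since the abstract does not say whether the exact form or the tanh approximation was used; the function names `gelu` and `phish` are chosen here for illustration and are not taken from the paper.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF.

    Assumption: the abstract does not state which GELU variant is used.
    """
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def phish(x: float) -> float:
    """Phish activation as stated in the abstract: x * tanh(GELU(x))."""
    return x * math.tanh(gelu(x))


if __name__ == "__main__":
    # Print a few sample points on both sides of zero.
    for x in (-10.0, -3.0, -1.0, 0.0, 1.0, 3.0, 10.0):
        print(f"phish({x:+.1f}) = {phish(x):.6f}")
```

Under this assumption the function is zero at the origin, approaches the identity for large positive inputs, and rises and then falls back toward zero on the negative side, which is consistent with the non-monotonic behavior described above.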

Institute of Electrical and Electronics Engineers (IEEE)

Philip Naveen

2021



