Phish: A Novel Hyper-Optimizable Activation Function
Deep-learning models are trained to estimate values using backpropagation. The
activation function within hidden layers is a critical component in
minimizing loss in deep neural networks. The Rectified Linear Unit (ReLU) has
been the dominant activation function for the past decade. Swish and
Mish are newer activation functions that have been shown to yield better
results than ReLU under specific circumstances. Phish is a novel
activation function proposed here. It is a composite function defined as
f(x) = x · TanH(GELU(x)), whose derivative shows no apparent
discontinuities on the domain observed. Four generalized networks
were constructed using Phish, Swish, Sigmoid, and TanH. SoftMax was the
output function. Using images from the MNIST and CIFAR-10 datasets, these
networks were trained to minimize sparse categorical cross-entropy. A
large-scale cross-validation was simulated using stochastic Markov
chains to account for the law of large numbers for the probability
values. Statistical tests support the research hypothesis that Phish
could outperform other activation functions in classification. Future
experiments would involve testing Phish in unsupervised learning
algorithms and comparing it with additional activation functions.
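
As an illustration of the definition f(x) = x · TanH(GELU(x)) given in the abstract, the following is a minimal sketch in plain Python. It assumes the exact (erf-based) form of GELU, which the abstract does not specify; the function names `gelu`, `phish`, and `phish_grad` are hypothetical and not taken from the paper. The finite-difference helper is only a rough way to inspect the derivative numerically, not the authors' method.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU, x * Phi(x), where Phi is the standard normal CDF.

    Assumption: the abstract does not say whether the exact form or the
    tanh approximation of GELU is used; the exact form is chosen here.
    """
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def phish(x: float) -> float:
    """Phish as defined in the abstract: f(x) = x * tanh(GELU(x))."""
    return x * math.tanh(gelu(x))


def phish_grad(x: float, h: float = 1e-5) -> float:
    """Central finite-difference estimate of f'(x), for inspection only."""
    return (phish(x + h) - phish(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Print the activation and its estimated slope at a few sample points.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  phish={phish(x):+.6f}  grad~{phish_grad(x):+.6f}")
```

Under these assumptions, the function is zero at the origin, approaches the identity for large positive inputs (where GELU(x) ≈ x and TanH saturates at 1), and takes small positive values for moderately negative inputs, since both x and TanH(GELU(x)) are negative there.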

