RLion: A Refined Lion Optimizer for Deep Learning
Abstract
Optimization algorithms play a fundamental role in training neural networks. Optimizer design focuses on how momentum and velocity weight the update with respect to learning rates and losses; the complexity of the optimizer and the number of updated parameters are also considered. In this paper, RLion (Refined Lion Optimizer), based on the Lion optimizer, is defined with a scalable factor $\alpha$ and an arctan operation, giving the update rule $\theta_t = \theta_{t-1} - \eta_t \left( \frac{2}{\pi} \arctan(\alpha \hat{m}_t) + \lambda \theta_{t-1} \right)$. arctan is a continuous, monotonic function whose expectation and variance are smaller than those of sign, and the fluctuation of $\theta$ under RLion is smaller than under Lion. The larger $\alpha$ is, the faster the convergence. RLion is able to smooth out fluctuations and to converge faster and more reliably. FasterNet, EfficientNetV2 and YOLO_V8 are trained for classification on the ImageNet1k dataset with the RLion optimizer, without warm-up. Vision Transformers for object detection on the Caltech 101 dataset and DeepLabV3+ for semantic segmentation on camera data are also trained with the AdamW, Lion and RLion optimizers. Compared with AdamW and Lion, the loss and accuracy results show that RLion raises validation accuracy by about 0 to 20% over AdamW on many models, even when the learning rate is as high as that used for AdamW. RLion has better convergence performance and versatility.
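The update rule in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation. It rests on several assumptions the abstract does not confirm: the prefactor is read as 2/π, so that the scaled arctan stays inside (-1, 1) like a smooth sign; the quantity written as m̂_t is taken to be the interpolated momentum used by the original Lion optimizer; and the names `rlion_step`, `beta1`, `beta2` and all default values are hypothetical.

```python
import math


def rlion_step(theta, grad, m, lr=1e-4, alpha=10.0,
               beta1=0.9, beta2=0.99, weight_decay=0.0):
    """Apply one RLion-style update to lists of floats.

    Returns (new_theta, new_m). Hyperparameter names and defaults are
    illustrative assumptions, not values taken from the paper.
    """
    new_theta, new_m = [], []
    for p, g, m_prev in zip(theta, grad, m):
        # Assumption: m_hat is Lion's interpolation of momentum and gradient.
        m_hat = beta1 * m_prev + (1.0 - beta1) * g
        # Smooth stand-in for sign(): (2/pi) * arctan(.) lies in (-1, 1).
        direction = (2.0 / math.pi) * math.atan(alpha * m_hat)
        # theta_t = theta_{t-1} - lr * (direction + lambda * theta_{t-1})
        new_theta.append(p - lr * (direction + weight_decay * p))
        # Assumption: the momentum buffer is updated as in Lion.
        new_m.append(beta2 * m_prev + (1.0 - beta2) * g)
    return new_theta, new_m


if __name__ == "__main__":
    # Toy usage: minimize f(x) = x^2, whose gradient is 2x.
    theta, m = [1.0], [0.0]
    for _ in range(200):
        grad = [2.0 * x for x in theta]
        theta, m = rlion_step(theta, grad, m, lr=0.01, alpha=10.0)
    print(theta)
```

Because the scaled arctan is strictly smaller than 1 in magnitude, each step is never larger than the corresponding sign-based step, and as `alpha` grows the direction approaches sign(m̂_t), which recovers a Lion-style update under the assumptions above.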

