Prioritized Experience Replay Based on Dynamics Priority
Abstract
Experience replay has been instrumental to significant advances in reinforcement learning because it increases data utilization. To further improve sampling efficiency, prioritized experience replay (PER) was proposed. PER prioritizes experiences by their temporal difference error (TD error), so that the agent learns from the more valuable experiences stored in the experience pool. Although various prioritized algorithms have been proposed, they ignore how the value of an experience changes during training and merely combine different priority criteria in a fixed or linear manner. In this paper, we present a novel prioritized experience replay algorithm, PERDP, which employs a dynamic priority adjustment framework. PERDP adaptively adjusts the weight of each criterion based on the average priority level of the experience pool and evaluates each experience's value according to the current network. We apply the algorithm to the SAC model and conduct experiments in OpenAI Gym environments. The experimental results show that PERDP converges faster than PER.
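To make the TD-error prioritization mentioned above concrete, here is a minimal sketch of the standard proportional PER sampling rule, not of PERDP itself, whose weighting scheme the abstract does not specify. The function names, the exponents `alpha` and `beta`, and the constant `eps` are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization (assumed form): p_i = (|delta_i| + eps)^alpha,
    P(i) = p_i / sum_k p_k. alpha and eps are illustrative defaults."""
    priorities = (np.abs(np.asarray(td_errors, dtype=float)) + eps) ** alpha
    return priorities / priorities.sum()


def sample_batch(td_errors, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Sample indices by priority and return importance-sampling weights.

    Weights are normalized by the largest weight in the sampled batch,
    a common simplification rather than the only possible choice.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = per_probabilities(td_errors, alpha)
    idx = rng.choice(len(probs), size=batch_size, p=probs)
    weights = (len(probs) * probs[idx]) ** (-beta)
    weights /= weights.max()
    return idx, weights


# Usage: transitions with larger |TD error| are drawn more often.
idx, weights = sample_batch([0.1, 2.0, 0.5, 1.0], batch_size=8)
```

With `alpha = 0` the sampling becomes uniform, which recovers ordinary experience replay. A dynamic scheme such as the one the abstract describes would presumably replace the single TD-error criterion with a weighted combination of criteria whose weights change during training.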

