MOSAIC for Multiple-Reward Environments
Reinforcement learning (RL) provides a basic framework in which autonomous robots can learn control policies that maximize future cumulative reward in complex environments. To achieve high performance, an RL controller must account for both the complex external dynamics that govern movement and the task, expressed as a reward function, when it optimizes its control commands. For example, a robot that plays both tennis and squash must cope with the different dynamics of a tennis racket and a squash racket, as well as with dynamic environmental factors such as wind. At the same time, it must tailor its tactics to the rules of whichever game it is playing. This double complexity of external dynamics and reward function is compounded when the dynamics and the reward functions both switch implicitly, as in a real (multi-agent) game of tennis in which a player cannot observe the intentions of her opponents or her partner. The robot must then take into account the unobservable behavioral goals (reward functions) of its opponent and its partner. In this article, we address how an RL agent should be designed to handle this double complexity of dynamics and reward. We previously proposed modular selection and identification for control (MOSAIC) to cope with nonstationary dynamics: among many candidate controllers, the appropriate ones are selected and trained according to the prediction error of the dynamics predictor (the forward model) paired with each controller. Here we extend this framework to RL and propose the MOSAIC-MR architecture. It resembles MOSAIC in spirit: it selects and trains an appropriate RL controller based on that controller's TD error, using the errors of the dynamics predictors (the forward models) and the reward predictors. Furthermore, unlike other MOSAIC variants for RL, its RL controllers are not paired a priori with fixed predictors of dynamics and rewards. The simulation results demonstrate that MOSAIC-MR outperforms its counterparts because of this flexible association among RL controllers, forward models, and reward predictors.
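The error-based selection described in the abstract can be illustrated with a minimal sketch. It assumes the Gaussian soft-max "responsibility" weighting used in the original MOSAIC formulation, applied separately to three hypothetical sets of modules. The function name, the error values, and the scale parameter `sigma` are all assumptions made for illustration. The abstract does not specify how MOSAIC-MR combines TD errors with the predictor errors, so the three sets are treated independently here; this is not the authors' implementation.

```python
import math

def responsibilities(errors, sigma=1.0):
    """Soft-max over negative squared errors: smaller error, larger weight.

    `sigma` is an assumed scale parameter; the weights sum to 1.
    """
    scores = [-(e * e) / (2.0 * sigma ** 2) for e in errors]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical errors at one time step (illustrative values only).
# The module counts differ on purpose: controllers are not paired
# one-to-one with forward models or reward predictors.
forward_model_errors = [0.1, 1.5, 2.0]   # dynamics prediction errors
reward_predictor_errors = [1.2, 0.2]     # reward prediction errors
td_errors = [0.9, 0.3, 1.4, 2.0]         # TD errors of the RL controllers

lambda_dynamics = responsibilities(forward_model_errors)
lambda_reward = responsibilities(reward_predictor_errors)
lambda_controller = responsibilities(td_errors)

# Select the controller with the highest responsibility (lowest TD error).
selected = max(range(len(td_errors)), key=lambda k: lambda_controller[k])
print("selected controller index:", selected)
```

In a full learner, weights of this kind would typically also scale each module's learning update, so that modules that explain the current context well are trained most strongly.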
MIT Press - Journals