MOSAIC for Multiple-Reward Environments

Reinforcement learning (RL) provides a basic framework in which autonomous robots learn to control their actions and maximize future cumulative rewards in complex environments. To achieve high performance, RL controllers must take into account both the complex external dynamics of movement and the task (the reward function) when optimizing control commands. For example, a robot that plays both tennis and squash must cope with the different dynamics of a tennis racket and a squash racket, as well as with dynamic environmental factors such as wind. At the same time, the robot has to tailor its tactics to the rules of whichever game it is playing. This double complexity of external dynamics and reward function is compounded when the multiple dynamics and the multiple reward functions both switch implicitly, as in a real (multi-agent) game of tennis, where a player cannot observe the intentions of her opponents or her partner. The robot must then consider the unobservable behavioral goals (reward functions) of its opponent and its partner. In this article, we address how an RL agent should be designed to handle this double complexity of dynamics and reward. We previously proposed modular selection and identification for control (MOSAIC) to cope with nonstationary dynamics: appropriate controllers are selected and learned from among many candidates based on the error of their paired dynamics predictors, the forward models. Here we extend this framework to RL and propose the MOSAIC-MR architecture. It resembles MOSAIC in spirit: it selects and learns an appropriate RL controller based on that controller's temporal-difference (TD) error, using the errors of the dynamics predictors (the forward models) and the reward predictors. Furthermore, unlike other MOSAIC variants for RL, its RL controllers are not paired a priori with fixed predictors of dynamics and rewards. The simulation results demonstrate that MOSAIC-MR outperforms its counterparts because of this flexible association among RL controllers, forward models, and reward predictors.
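
The selection principle described above can be illustrated with a minimal sketch. It is not the MOSAIC-MR algorithm itself, whose equations the abstract does not give. It assumes the softmax-style "responsibility" weighting commonly associated with MOSAIC, in which a module with a smaller error receives a larger weight. The function name `responsibilities`, the scale parameter `sigma`, and all error values are hypothetical and chosen only for illustration.

```python
import math

def responsibilities(errors, sigma=1.0):
    """Normalized weights from module errors (assumed softmax form).

    A smaller error gives a larger weight; sigma (hypothetical) sets
    how sharply the weights concentrate on the best module.
    """
    logits = [-(e ** 2) / (2.0 * sigma ** 2) for e in errors]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative (made-up) errors for each family of modules.
dynamics_errors = [0.1, 1.5]         # forward-model prediction errors
reward_errors = [2.0, 0.2, 1.0]      # reward-predictor errors
td_errors = [0.3, 0.9, 0.05, 1.2]    # TD errors of the RL controllers

lam_dynamics = responsibilities(dynamics_errors)
lam_reward = responsibilities(reward_errors)
lam_controller = responsibilities(td_errors)

# Each family is weighted independently here, mirroring the idea that
# controllers are not tied in advance to one fixed pair of predictors.
selected_controller = max(range(len(td_errors)), key=lambda i: lam_controller[i])
print(lam_dynamics, lam_reward, selected_controller)
```

In this sketch the three families of modules are scored separately, so any controller can be combined with any forward model and reward predictor. How MOSAIC-MR actually couples the predictor errors to the controllers' TD errors is specified in the article, not here.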