MOSAIC for Multiple-Reward Environments

Reinforcement learning (RL) can provide a basic framework for autonomous robots to learn to control and maximize future cumulative rewards in complex environments. To achieve high performance, RL controllers must take into account both the complex external dynamics of movement and the task (reward function) when optimizing control commands. For example, a robot that plays both tennis and squash needs to cope with the different dynamics of a tennis or squash racket and with dynamic environmental factors such as the wind. In addition, this robot simultaneously has to tailor its tactics to the rules of either game. This double complexity of external dynamics and reward function is compounded when the multiple dynamics and the multiple reward functions both switch implicitly, as in a real (multi-agent) game of tennis, where a player cannot observe the intentions of her opponents or her partner. The robot must consider its opponent's and its partner's unobservable behavioral goals (reward functions). In this article, we address how an RL agent should be designed to handle such double complexity of dynamics and reward. We have previously proposed modular selection and identification for control (MOSAIC) to cope with nonstationary dynamics: appropriate controllers are selected and learned among many candidates based on the error of each controller's paired dynamics predictor, the forward model. Here we extend this framework to RL and propose the MOSAIC-MR architecture. It resembles MOSAIC in spirit and selects and learns an appropriate RL controller based on the RL controller's TD error, using the errors of the dynamics predictors (the forward models) and the reward predictors. Furthermore, unlike other MOSAIC variants for RL, RL controllers are not a priori paired with fixed predictors of dynamics and rewards. The simulation results demonstrate that MOSAIC-MR outperforms its counterparts because of this flexible association among RL controllers, forward models, and reward predictors.
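
The selection principle described above (prefer the controller whose paired predictor has the smallest error) can be illustrated with a minimal Python sketch. It is not the MOSAIC-MR algorithm from the article: it assumes the soft-max "responsibility" weighting commonly associated with the original MOSAIC formulation, with Gaussian error likelihoods and equal priors over modules. The function names, the `sigma` parameter, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def responsibilities(errors, sigma=1.0):
    """Turn per-module prediction errors into normalized weights.

    Assumption: each module's likelihood is Gaussian in its prediction
    error with a shared width `sigma`, and all modules have equal priors.
    """
    scores = [-(e * e) / (2.0 * sigma * sigma) for e in errors]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def blend_commands(commands, resp):
    """Responsibility-weighted sum of the modules' control commands."""
    return sum(c * r for c, r in zip(commands, resp))


# Hypothetical forward-model errors for three modules at one time step.
errors = [0.1, 0.8, 1.5]
resp = responsibilities(errors, sigma=0.5)

# Hypothetical scalar commands proposed by the three paired controllers.
command = blend_commands([1.0, -1.0, 0.0], resp)
```

In this sketch the module with the smallest prediction error receives the largest weight, so its controller dominates the blended command. According to the abstract, MOSAIC-MR differs in that selection is based on the RL controller's TD error and controllers are not fixed to particular dynamics or reward predictors; that association step is not modeled here.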

MIT Press - Journals

Norikazu Sugimoto Masahiko Haruno Kenji Doya Mitsuo Kawato

Neural Computation

2011
