A Multi-Agent Centralized Strategy Gradient Reinforcement Learning Algorithm Based on State Transition

Deterministic strategy algorithms are widely used in Multi-Agent Deep Reinforcement Learning (MADRL) for collaborative tasks, but achieving stable, high-performance cooperative behavior with them remains a significant challenge. To balance exploration and exploitation for multi-agent ant robots in a partially observable, continuous action space, this study introduces a multi-agent centralized strategy gradient algorithm grounded in a local state transition mechanism. The algorithm learns local state and local state-action representations from local observations and action values, thereby establishing a “local state transition” mechanism autonomously. Used as the input of the actor network, the automatically extracted local observation representation reduces the input state dimension, emphasizes the local state features most closely related to the local state transition, and encourages the agent to use the features that affect the next observation state. To mitigate non-stationarity and credit assignment issues in multi-agent environments, a centralized critic network evaluates the current joint strategy. The proposed algorithm, NST-FACMAC, is evaluated against other multi-agent deterministic strategy algorithms in a continuous control simulation environment using a multi-agent ant robot. The experimental results indicate faster convergence and higher average reward values in cooperative multi-agent ant simulation environments. Notably, in four simulated environments named Ant-v2 (2 × 4), Ant-v2 (2 × 4d), Ant-v2 (4 × 2), and Manyant (2 × 3), the algorithm demonstrates performance improvements of approximately 1.9%, 4.8%, 11.9%, and 36.1%, respectively, compared to the best baseline algorithm. These findings underscore the algorithm’s effectiveness in enhancing the stability of multi-agent ant robot control within dynamic environments.
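The data flow described in the abstract (local encoder, transition-based representation objective, actor fed by the representation, centralized critic over the joint behavior) can be illustrated with a small forward-pass sketch. This is not the authors' NST-FACMAC implementation: the abstract does not specify layer types, sizes, or losses, so the single-layer tanh maps, the squared-error transition loss, the linear critic, and every name below (`LocalAgent`, `transition_loss`, `centralized_q`, the dimension constants) are assumptions chosen only for illustration. No training step is shown.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def tanh_vec(x):
    """Apply tanh element-wise."""
    return [math.tanh(v) for v in x]


def rand_matrix(rows, cols, rng):
    """Random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class LocalAgent:
    """One agent: encoder, transition predictor, and actor (all hypothetical)."""

    def __init__(self, obs_dim, rep_dim, act_dim, rng):
        self.enc = rand_matrix(rep_dim, obs_dim, rng)            # obs -> representation
        self.trans = rand_matrix(rep_dim, rep_dim + act_dim, rng)  # (rep, action) -> next rep
        self.actor = rand_matrix(act_dim, rep_dim, rng)          # representation -> action

    def encode(self, obs):
        # Lower-dimensional local state representation (rep_dim < obs_dim).
        return tanh_vec(linear(self.enc, obs))

    def act(self, obs):
        # The actor sees only the learned representation, not the raw observation.
        return tanh_vec(linear(self.actor, self.encode(obs)))

    def transition_loss(self, obs, action, next_obs):
        # Assumed objective: predict the next representation from (rep, action).
        predicted = tanh_vec(linear(self.trans, self.encode(obs) + action))
        target = self.encode(next_obs)
        return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def centralized_q(critic_weights, representations, actions):
    """Assumed linear critic scoring the joint representations and joint actions."""
    joint = [v for rep in representations for v in rep]
    joint += [v for act in actions for v in act]
    return sum(w * v for w, v in zip(critic_weights, joint))


rng = random.Random(0)
OBS_DIM, REP_DIM, ACT_DIM, N_AGENTS = 8, 3, 2, 2  # illustrative sizes only

agents = [LocalAgent(OBS_DIM, REP_DIM, ACT_DIM, rng) for _ in range(N_AGENTS)]
observations = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]
next_observations = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]

representations = [agent.encode(obs) for agent, obs in zip(agents, observations)]
actions = [agent.act(obs) for agent, obs in zip(agents, observations)]
losses = [
    agent.transition_loss(obs, act, nxt)
    for agent, obs, act, nxt in zip(agents, observations, actions, next_observations)
]

critic_weights = [rng.uniform(-0.5, 0.5) for _ in range(N_AGENTS * (REP_DIM + ACT_DIM))]
q_value = centralized_q(critic_weights, representations, actions)
```

In this sketch each agent acts from its own compressed representation, while only the critic sees the joint information, which mirrors the centralized-critic, decentralized-actor split the abstract describes.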

MDPI AG

Lei Sheng, Honghui Chen, Xiliang Chen

Algorithms

2024
