
A Multi-Agent Centralized Strategy Gradient Reinforcement Learning Algorithm Based on State Transition

Deterministic strategy algorithms are widely used in Multi-Agent Deep Reinforcement Learning (MADRL) for collaborative tasks, yet achieving stable, high-performance cooperative behavior with them remains a significant challenge. To balance exploration and exploitation for multi-agent ant robots in a partially observable environment with a continuous action space, this study introduces a multi-agent centralized strategy gradient algorithm grounded in a local state transition mechanism. The algorithm learns local state and local state-action representations from local observations and action values, thereby establishing a “local state transition” mechanism autonomously. Used as the input of the actor network, the automatically extracted local observation representation reduces the input state dimension, enhances the local state features closely related to the local state transition, and encourages the agent to use the local state features that affect the next observation state. To mitigate non-stationarity and credit assignment issues in multi-agent environments, a centralized critic network evaluates the current joint strategy. The proposed algorithm, NST-FACMAC, is evaluated against other multi-agent deterministic strategy algorithms in a continuous control simulation environment using a multi-agent ant robot. The experimental results indicate faster convergence and higher average reward in cooperative multi-agent ant simulation environments. Notably, in four simulated environments named Ant-v2 (2 × 4), Ant-v2 (2 × 4d), Ant-v2 (4 × 2), and Manyant (2 × 3), the algorithm improves performance by approximately 1.9%, 4.8%, 11.9%, and 36.1%, respectively, compared to the best baseline algorithm. These findings underscore the algorithm’s effectiveness in enhancing the stability of multi-agent ant robot control within dynamic environments.
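The data flow described in the abstract can be illustrated with a rough NumPy sketch. It is not the NST-FACMAC implementation: the linear layers, the dimensions, the mean-squared-error transition loss, and every name (`LocalTransitionModel`, `actor`, `centralized_critic`) are assumptions made for illustration, and the training loop is omitted. It only shows a per-agent encoder that compresses a local observation, a latent transition model that predicts the next local state from the current local state and action, an actor that consumes the compressed representation, and a single critic that scores the joint behavior of all agents.

```python
import numpy as np

# Hypothetical sizes; the abstract does not give the real dimensions.
OBS_DIM, ACT_DIM, LATENT_DIM, N_AGENTS = 12, 4, 6, 2


class LocalTransitionModel:
    """Illustrative per-agent encoder and latent transition model."""

    def __init__(self, rng):
        # Encoder: local observation -> lower-dimensional local state.
        self.W_enc = rng.normal(0.0, 0.1, (OBS_DIM, LATENT_DIM))
        # Transition: (local state, action) -> predicted next local state.
        self.W_trans = rng.normal(0.0, 0.1, (LATENT_DIM + ACT_DIM, LATENT_DIM))

    def encode(self, obs):
        return np.tanh(obs @ self.W_enc)

    def predict_next(self, z, act):
        return np.concatenate([z, act], axis=-1) @ self.W_trans

    def transition_loss(self, obs, act, next_obs):
        # Assumed objective: the predicted next local state should match
        # the encoding of the observation that actually followed.
        pred = self.predict_next(self.encode(obs), act)
        target = self.encode(next_obs)
        return float(np.mean((pred - target) ** 2))


def actor(z, W_pi):
    """Deterministic actor acting on the compressed local state."""
    return np.tanh(z @ W_pi)


def centralized_critic(all_z, all_act, w_q):
    """Scores the joint behavior from every agent's local state and action."""
    joint = np.concatenate([all_z.ravel(), all_act.ravel()])
    return float(joint @ w_q)


rng = np.random.default_rng(0)
model = LocalTransitionModel(rng)
W_pi = rng.normal(0.0, 0.1, (LATENT_DIM, ACT_DIM))
w_q = rng.normal(0.0, 0.1, N_AGENTS * (LATENT_DIM + ACT_DIM))

obs = rng.normal(size=(N_AGENTS, OBS_DIM))       # one local observation per agent
next_obs = rng.normal(size=(N_AGENTS, OBS_DIM))  # stand-in for the next observations

z = model.encode(obs)            # shape (N_AGENTS, LATENT_DIM)
actions = actor(z, W_pi)         # shape (N_AGENTS, ACT_DIM)
loss = model.transition_loss(obs, actions, next_obs)
q_joint = centralized_critic(z, actions, w_q)
```

In this sketch the actor input shrinks from `OBS_DIM` to `LATENT_DIM`, which mirrors the reduced input state dimension mentioned in the abstract. The single linear critic is a simplification chosen for brevity and should not be read as the critic architecture of the published algorithm.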

Related Results

Fertility Transition Across Major Sub-Saharan African Cities: The Role of Proximate Determinants
Abstract. Background: Sub-Saharan Africa’s fertility transition has lagged behind other regions despite rapid urbanization, resulting in persistently high fertility rates. S...
The Effect of Compression Reinforcement on the Shear Behavior of Concrete Beams with Hybrid Reinforcement
Abstract: This study examines the impact of steel compression reinforcement on the shear behavior of concrete beams reinforced with glass fiber reinforced polymer (GFRP) bar...
APPLICATION OF INTELLIGENT MULTIAGENT APPROACH TO LYME DISEASE SIMULATION
Objective: The objective of this research is to develop a model for forecasting Lyme disease dynamics that will help to take effective preventive and control me...
Study on Scheme Optimization of bridge reinforcement increasing ratio
Abstract: Each bridge reinforcement method has its advantages and disadvantages. The load-bearing capacity of bridge members is controlled by the ultimat...
QiMARL: Quantum-Inspired Multi-Agent Reinforcement Learning Strategy for Efficient Resource Energy Distribution in Nodal Power Stations
The coupling of quantum computing with multi-agent reinforcement learning (MARL) provides an exciting direction to tackle intricate decision-making tasks in high-dimensional spaces...
Automation of aeronautical information processing based on multi-agent technologies
Progress in the development of computer engineering provides an opportunity to address a wider variety of challenges using computer software systems. The task of automatic aeronaut...
Reinforcement Learning Based Decision Support Tool For Epidemic Control
Rationale: Covid-19 is certainly one of the worst pandemics ever. In the absence of a vaccine, classical epidemiological measures such as testing in order to isolate the infected p...
Reinforcement Learning: Theory and Applications in HEMS
The twin capabilities of learning from experience and learning at higher levels of abstraction set reinforcement learning apart from other areas of machine learning and (within th...
