Adaptive Workflow Scheduling in Heterogeneous GPU Clusters via Deep Reinforcement Learning

The proliferation of heterogeneous Graphics Processing Unit (GPU) clusters has introduced unprecedented computational capabilities for workflow execution across diverse scientific and industrial domains. However, the inherent heterogeneity of GPU resources, coupled with dynamic workload characteristics and complex workflow dependencies, presents substantial challenges for efficient scheduling. Traditional heuristic-based scheduling algorithms such as Heterogeneous Earliest Finish Time (HEFT) and First-In-First-Out with Duplication and Earliest Finish Time (FIFO-DEFT) often fail to adapt to rapidly changing cluster states and evolving workload patterns. This paper proposes an adaptive workflow scheduling framework leveraging Deep Reinforcement Learning (DRL) to intelligently allocate workflow tasks to heterogeneous GPU resources. The proposed approach employs a Deep Q-Network (DQN) architecture integrated with prioritized experience replay to learn optimal scheduling policies through continuous interaction with the cluster environment. The framework models workflow scheduling as a Markov Decision Process (MDP) where the agent learns to minimize makespan, maximize resource utilization, and maintain quality-of-service guarantees. Extensive experimental evaluations demonstrate that the DRL-based scheduler achieves significant performance improvements compared to baseline algorithms including HEFT, FIFO-DEFT, and other state-of-the-art schedulers. The proposed method exhibits superior adaptability to varying cluster configurations and workflow characteristics, maintaining robust performance across diverse execution scenarios while reducing average makespan and improving scheduling length ratio metrics.
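The abstract names prioritized experience replay as part of the DQN training loop but gives no detail. As a rough illustration only, the sketch below shows the commonly used proportional variant: transitions are sampled with probability proportional to priority raised to `alpha`, and importance-sampling weights scaled by `beta` correct the resulting bias. The class name, the list-based storage, and the `alpha`, `beta`, and `eps` values are all assumptions made for this example, not details taken from the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PrioritizedReplayBuffer:
    """Hypothetical proportional prioritized replay buffer (not the paper's code)."""
    capacity: int
    alpha: float = 0.6   # assumed value; 0 would give uniform sampling
    eps: float = 1e-3    # assumed value; keeps every priority strictly positive
    transitions: list = field(default_factory=list)
    priorities: list = field(default_factory=list)
    next_slot: int = 0

    def add(self, transition):
        # New transitions take the current maximum priority so each is
        # likely to be replayed before its TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.transitions) < self.capacity:
            self.transitions.append(transition)
            self.priorities.append(p)
        else:
            # Buffer is full: overwrite the oldest entry (ring buffer).
            self.transitions[self.next_slot] = transition
            self.priorities[self.next_slot] = p
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.transitions)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights (N * P(i))^-beta, normalized by the
        # largest weight in this batch so they only scale updates down.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.transitions[i] for i in idx], weights

    def update(self, indices, td_errors):
        # After a learning step, priority becomes |TD error| + eps.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a scheduler along the lines the abstract describes, each stored transition would presumably hold a cluster state, a task-to-GPU assignment, a reward, and the next state. This version recomputes the probabilities on every call, which is linear in the buffer size; larger buffers usually use a sum tree instead.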

Related Results

Learning Approaches to Dynamic Workflow Scheduling based on Genetic Programming and Deep Reinforcement Learning
Dynamic workflow scheduling (DWS) in cloud computing is a critical yet challenging problem, involving assigning numerous workflow tasks to heterogeneous virt...
EDQWS: an enhanced divide and conquer algorithm for workflow scheduling in cloud
A workflow is an effective way for modeling complex applications and serves as a means for scientists and researchers to better understand the details of applications. Clou...
Vina-GPU 2.1: towards further optimizing docking speed and precision of AutoDock Vina and its derivatives
AutoDock Vina and its derivatives have established themselves as a prevailing pipeline for virtual screening in contemporary drug discovery. Our Vina-GPU method leverages t...
Visual versus Tabular Scheduling Programs
Effective scheduling in construction is crucial for ensuring timely project completion and maintaining budget control. Scheduling programs play an important role in this process by...
Adaptive Scheduling of Mixing Trucks in Construction Sites with an Improved Deep Q-Network
The management of concrete mixing station distribution is evolving toward more intelligent and efficient methods. Additionally, in the context of the group operations of commodity ...
Vina-GPU 2.0: further accelerating AutoDock Vina and its derivatives with GPUs
Modern drug discovery typically faces large virtual screens from huge compound databases where multiple docking tools are involved for meeting various real scenes or improving the ...
The Effect of Compression Reinforcement on the Shear Behavior of Concrete Beams with Hybrid Reinforcement
This study examines the impact of steel compression reinforcement on the shear behavior of concrete beams reinforced with glass fiber reinforced polymer (GFRP) bar...