Distributed in-memory data management for workflow executions
Complex scientific experiments from various domains are typically modeled as workflows and executed on large-scale machines using a parallel Workflow Management System (WMS). Since such executions usually last for hours or days, some WMSs provide user steering support, i.e., they allow users to run data analyses and, depending on the results, adapt the workflows at runtime. A challenge in designing parallel execution control is to manage workflow data for efficient execution while also enabling user steering. Data access for high scalability is typically transaction-oriented, whereas data access for analysis is oriented toward online analytical processing, and managing such hybrid workloads makes the challenge even harder. In this work, we present SchalaDB, an architecture with a set of design principles and techniques based on distributed in-memory data management for efficient workflow execution control and user steering. We propose a distributed data design for scalable workflow task scheduling and high availability driven by a parallel and distributed in-memory DBMS. To evaluate our proposal, we develop d-Chiron, a WMS designed according to SchalaDB’s principles. We carry out an extensive experimental evaluation on an HPC cluster with up to 960 computing cores. Among other analyses, we show that even when running data analyses for user steering, SchalaDB’s overhead is negligible for workloads composed of hundreds of concurrent tasks on shared data. Our results encourage workflow engine developers to follow a parallel and distributed data-oriented approach not only for scheduling and monitoring but also for user steering.
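The hybrid workload described in the abstract can be illustrated with a minimal sketch. It is not SchalaDB's or d-Chiron's actual schema or API: the `task` table, its columns, and the function names below are hypothetical, and a single-process in-memory SQLite database stands in for the parallel, distributed in-memory DBMS. The sketch only shows the two access patterns sharing the same data: short transactional updates for scheduling, and an aggregate query of the kind a user might run for steering.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical task table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE task (
               id        INTEGER PRIMARY KEY,
               activity  TEXT NOT NULL,
               status    TEXT NOT NULL DEFAULT 'READY',
               worker    TEXT,
               runtime_s REAL
           )"""
    )
    return conn


def add_tasks(conn, activity, count):
    """Register `count` READY tasks for one workflow activity."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany(
            "INSERT INTO task (activity) VALUES (?)", [(activity,)] * count
        )


def claim_task(conn, worker):
    """Transactional access: hand the oldest READY task to `worker`.

    Returns the task id, or None when no task is READY. With a single
    connection this is trivially safe; a distributed DBMS would rely on
    its own concurrency control here (assumption, not shown).
    """
    with conn:
        row = conn.execute(
            "SELECT id FROM task WHERE status = 'READY' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE task SET status = 'RUNNING', worker = ? WHERE id = ?",
            (worker, row[0]),
        )
        return row[0]


def finish_task(conn, task_id, runtime_s):
    """Transactional access: record completion of one task."""
    with conn:
        conn.execute(
            "UPDATE task SET status = 'DONE', runtime_s = ? WHERE id = ?",
            (runtime_s, task_id),
        )


def steering_summary(conn):
    """Analytical access: task counts and mean runtime per activity/status."""
    return conn.execute(
        """SELECT activity, status, COUNT(*), AVG(runtime_s)
           FROM task
           GROUP BY activity, status
           ORDER BY activity, status"""
    ).fetchall()
```

In this sketch, workers would call `claim_task` and `finish_task` in a loop while a user periodically calls `steering_summary` on the same data to decide whether to adapt the workflow.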

