
Sojourn Time Minimization of Successful Jobs

Due to a growing interest in deep learning applications [5], compute-intensive and long-running (hours to days) training jobs have become a significant component of datacenter workloads. A large fraction of these jobs is often exploratory, with the goal of determining the best model structure (e.g., the number of layers and channels in a convolutional neural network), hyperparameters (e.g., the learning rate), and data augmentation strategies for the target application. Notably, training jobs are often terminated early if their learning metrics (e.g., training and validation accuracy) are not converging, with only a few completing successfully. For this motivating application, we consider the problem of scheduling a set of jobs that can be terminated at predetermined checkpoints with known probabilities estimated from historical data. We prove that, in order to minimize the time to complete the first K successful jobs on a single server, optimal scheduling does not require preemption (even when preemption overhead is negligible) and provide an optimal policy; advantages of this policy are quantified through simulation.

Related Work. While job scheduling has been investigated extensively in many scenarios (see [6] and [2] for a survey of recent results), most policies require that the cost of waiting times of each job be known at scheduling time; in contrast, in our setting the scheduler does not know which job will be the K-th successful job, and sojourn times of subsequent jobs do not contribute to the target metric. For example, [4, 3] minimize makespan (i.e., the time to complete all jobs) for known execution times and waiting time costs; similarly, Gittins index [1] and SR rank [7] minimize expected sojourn time of all jobs, i.e., both successfully completed jobs and jobs terminated early. Unfortunately, scheduling policies that do not distinguish between these two types of jobs may favor jobs where the next stage is short and leads to early termination with high probability, which is an undesirable outcome in our applications of interest.
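To make the target metric concrete, the following is a minimal Python sketch of the setting described above: jobs made of stages that end at checkpoints, a termination probability at each checkpoint, and the time at which the K-th successful job completes on a single server when jobs run without preemption in a given order. It is not the optimal policy from the paper, which the abstract does not specify. The names `Job`, `run_job`, `time_to_k_successes`, and `estimate`, as well as the representation of a job as a tuple of (duration, termination probability) pairs, are assumptions made for illustration only.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Job:
    # Hypothetical representation: each stage is (duration, probability that
    # the job is terminated at the checkpoint ending that stage).
    stages: Tuple[Tuple[float, float], ...]


def run_job(job: Job, rng: random.Random) -> Tuple[float, bool]:
    """Run one job to its end or to early termination; return (time used, succeeded)."""
    elapsed = 0.0
    for duration, p_terminate in job.stages:
        elapsed += duration
        if rng.random() < p_terminate:
            return elapsed, False  # terminated early at this checkpoint
    return elapsed, True  # passed every checkpoint


def time_to_k_successes(order: Sequence[Job], k: int,
                        rng: random.Random) -> Optional[float]:
    """Time at which the k-th successful job completes when jobs run
    back to back in the given order; None if fewer than k jobs succeed."""
    clock = 0.0
    successes = 0
    for job in order:
        used, succeeded = run_job(job, rng)
        clock += used
        if succeeded:
            successes += 1
            if successes == k:
                return clock
    return None


def estimate(order: Sequence[Job], k: int, runs: int = 10_000,
             seed: int = 0) -> Tuple[float, float]:
    """Monte Carlo estimate: (mean time to k successes over the runs that
    reach k successes, fraction of runs that reach k successes)."""
    rng = random.Random(seed)
    samples: List[Optional[float]] = [
        time_to_k_successes(order, k, rng) for _ in range(runs)
    ]
    finished = [s for s in samples if s is not None]
    mean = sum(finished) / len(finished) if finished else float("nan")
    return mean, len(finished) / runs
```

Calling `estimate` on two different orderings of the same job set gives a rough comparison of how the order affects the time to the first K successes. Note that the mean is conditional on reaching K successes, so it should be read together with the returned fraction.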

Association for Computing Machinery (ACM)

Yuan Yao Marco Paolieri Leana Golubchik

ACM SIGMETRICS Performance Evaluation Review

2022
