Workflow scheduling with fault tolerance

Laiping Zhao, Kouichi Sakurai

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter surveys workflow scheduling with fault tolerance. It first introduces workflow scheduling and fault tolerance technologies separately, then reviews related work that combines the two. This work is classified into six categories, one for each fault tolerance technology: workflow scheduling with primary/backup, primary/backup with multiple backups, checkpointing, rescheduling, active replication, and active replication with dynamic replicas. An in-depth study of these six topics illustrates the challenges explored so far, such as overloading conditions and tradeoffs among scheduling criteria, and identifies some future research directions. As applications grow increasingly complex and failures become a severe problem in large-scale systems, the authors aim to provide a comprehensive review of the problem of workflow scheduling with fault tolerance.

Original language: English
Title of host publication: Network and Traffic Engineering in Emerging Distributed Computing Applications
Publisher: IGI Global
Pages: 94-123
Number of pages: 30
ISBN (Print): 9781466618886
Publication status: Published - 2012

All Science Journal Classification (ASJC) codes

  • General Computer Science
