TY - CHAP
T1 - Workflow scheduling with fault tolerance
AU - Zhao, Laiping
AU - Sakurai, Kouichi
PY - 2012
Y1 - 2012
N2 - This chapter presents a study of workflow scheduling with fault tolerance. It begins with separate overviews of workflow scheduling and of fault tolerance technologies. Next, the chapter surveys related work that combines workflow scheduling with fault tolerance technologies. These works are classified into six categories, corresponding to six fault tolerance technologies: workflow scheduling with primary/backup, primary/backup with multiple backups, checkpoint, rescheduling, active replication, and active replication with dynamic replicas. An in-depth study of these six topics illustrates the challenges explored so far, such as overloading conditions and tradeoffs among scheduling criteria, and identifies some future research directions. As applications grow increasingly complex and failures become a severe problem in large-scale systems, the authors aim to provide a comprehensive review of the problem of workflow scheduling with fault tolerance through this work.
AB - This chapter presents a study of workflow scheduling with fault tolerance. It begins with separate overviews of workflow scheduling and of fault tolerance technologies. Next, the chapter surveys related work that combines workflow scheduling with fault tolerance technologies. These works are classified into six categories, corresponding to six fault tolerance technologies: workflow scheduling with primary/backup, primary/backup with multiple backups, checkpoint, rescheduling, active replication, and active replication with dynamic replicas. An in-depth study of these six topics illustrates the challenges explored so far, such as overloading conditions and tradeoffs among scheduling criteria, and identifies some future research directions. As applications grow increasingly complex and failures become a severe problem in large-scale systems, the authors aim to provide a comprehensive review of the problem of workflow scheduling with fault tolerance through this work.
UR - http://www.scopus.com/inward/record.url?scp=84898250258&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84898250258&partnerID=8YFLogxK
U2 - 10.4018/978-1-4666-1888-6.ch005
DO - 10.4018/978-1-4666-1888-6.ch005
M3 - Chapter
AN - SCOPUS:84898250258
SN - 9781466618886
SP - 94
EP - 123
BT - Network and Traffic Engineering in Emerging Distributed Computing Applications
PB - IGI Global
ER -