Abstract:
Basic science is becoming ever more computationally intensive, increasing the need for large-scale compute and storage resources, whether within a High Performance Computing cluster or, more recently, within the cloud. In most cases, large-scale scientific computation is represented as a workflow for scheduling and runtime provisioning. Such scheduling becomes an even more challenging problem on cloud systems due to the dynamic nature of the cloud, in particular its elasticity, pricing models (both static and dynamic), non-homogeneous resource types, vast array of services, and virtualization. The mapping of workflow tasks onto a set of provisioned instances is an example of the general scheduling problem and is NP-complete. In addition, certain runtime constraints must be met, the most typical being the cost of the computation and the time it requires to complete. In this article, we introduce a new heuristic scheduling algorithm, Budget Deadline Aware Scheduling (BDAS), that addresses eScience workflow scheduling under budget and deadline constraints in Infrastructure as a Service (IaaS) clouds. The novelty of our work lies in satisfying both budget and deadline constraints while introducing a tunable cost-time trade-off over heterogeneous instances. In addition, we study the stability and robustness of our algorithm by performing sensitivity analysis. The results demonstrate that, overall, BDAS finds a viable schedule that satisfies both defined constraints, budget and deadline, for more than 40,000 test cases. Moreover, our algorithm achieves a 17.0 to 23.8 percent higher success rate than state-of-the-art algorithms.