Efficient big data analysis is critical to support applications or services in Internet of Things (IoT) system, especially for the time-intensive services. Hence, the data center may host heterogeneous big data analysis tasks for multiple IoT systems. It is a challenging problem since the data centers usually need to schedule a large number of periodic or online tasks in a short time. In this paper, we investigate the heterogeneous task scheduling problem to reduce the global task execution time, which is also an efficient method to reduce energy consumption for data centers. We establish the task execution for heterogeneous tasks respectively based on the data locality feature, which also indicate the relationship among the tasks, data blocks and servers. We propose a heterogeneous task scheduling algorithm with data migration. The core idea of the algorithm is to maximize the efficiency by comparing the cost between remote task execution and data migration, which could improve the data locality and reduce task execution time. We conduct extensive simulations and the experimental results show that our algorithm has better performance than the traditional methods, and data migration actually works to reduce the overall task execution time. The algorithm also shows acceptable fairness for the heterogeneous tasks.

Data-Intensive Task Scheduling for Heterogeneous Big Data Analytics in IoT System

Pau, Giovanni;
2020-01-01

Abstract

Efficient big data analysis is critical to support applications or services in Internet of Things (IoT) system, especially for the time-intensive services. Hence, the data center may host heterogeneous big data analysis tasks for multiple IoT systems. It is a challenging problem since the data centers usually need to schedule a large number of periodic or online tasks in a short time. In this paper, we investigate the heterogeneous task scheduling problem to reduce the global task execution time, which is also an efficient method to reduce energy consumption for data centers. We establish the task execution for heterogeneous tasks respectively based on the data locality feature, which also indicate the relationship among the tasks, data blocks and servers. We propose a heterogeneous task scheduling algorithm with data migration. The core idea of the algorithm is to maximize the efficiency by comparing the cost between remote task execution and data migration, which could improve the data locality and reduce task execution time. We conduct extensive simulations and the experimental results show that our algorithm has better performance than the traditional methods, and data migration actually works to reduce the overall task execution time. The algorithm also shows acceptable fairness for the heterogeneous tasks.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11387/140943
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 2
social impact