Cristiano Costa

Universidade do Vale do Rio dos Sinos, Brazil

1 chapter authored

Chapters authored

Looking at Data Science through the Lens of Scheduling and Load Balancing

By Diórgenes Eugênio da Silveira, Eduardo Souza dos Reis, Rodrigo Simon Bavaresco, Marcio Miguel Gomes, Cristiano André da Costa, Jorge Luis Victoria Barbosa, Rodolfo Stoffel Antunes, Alvaro Machado Júnior, Rodrigo Saad and Rodrigo da Rosa Righi

The growth in data generated by private and public organizations creates many opportunities to obtain valuable knowledge. In this scenario, data science becomes pertinent as a structured methodology for extracting that knowledge from raw data. It encompasses a heterogeneous group of techniques, which makes it difficult to implement a single platform capable of incorporating all the available resources. Thus, it is necessary to formulate a data science workflow based on different tools to extract knowledge from massive datasets. In this context, high-performance computing (HPC) provides the infrastructure required to optimize the processing time of data science workflows, which become collections of tasks that must be efficiently scheduled to provide results in acceptable time intervals. While few studies explore the use of HPC for data science tasks, to the best of our knowledge, none conducts an in-depth analysis of scheduling and load balancing in such workflows. This chapter therefore presents an analysis of scheduling and load balancing from the perspective of data science scenarios. It presents concepts, environments, and tools to summarize the theoretical background required to define, assign, and execute data science workflows. Furthermore, it presents new trends at the intersection of data science, scheduling, and load balancing.

Part of the book: Scheduling Problems

Related collaborators

Larysa Globa

Hong Seong Park

Alexander Koval

Tahani Aladwani

Ade Jamal

Liliana Grigoriu

Nataliia Gvozdetska

Volodymyr Prokopets

Surya Teja Marella

Thummuru Gunasekhar