Publication
CAiSE 2023
Conference paper

TEADAL: Trustworthy, Energy-Aware federated DAta Lakes along the computing continuum

Abstract

While the value of data analytics to any organization is widely recognized, most approaches and solutions rely on processing data that are either produced internally or, when obtained from outside, publicly available with almost no restrictions. The situation becomes more challenging when data analytics can take advantage of data owned by other organizations, which are willing to share them only under conditions that do not diminish the value such data have for their owners. In this context, the need for data sharing must be reconciled with the need for data sovereignty, i.e., the ability of an organization to exert full control over its data. This introduces barriers at both the organizational and technological levels: data must be shared under an agreement established among the parties, which could affect the locations (on premises or in the cloud) where the data are stored and processed. The goal of the TEADAL project is to provide a set of tools that enable the creation of a federation of data lakes, in particular by reducing the barriers to defining the terms under which data can be shared. The actual sharing shall optimize the use of resources owned by the federation members to foster efficient, energy-aware, and trusted data management.