HKPoly: A Polystore Architecture to Support Data Linkage and Queries on Distributed and Heterogeneous Data
Abstract
Context: Modern information systems commonly manipulate heterogeneous data and schemas that are fragmented across the data stores best suited to their storage and access requirements. Moreover, the business processes of different organizations consume these fragments independently, without explicit links between the data they use. Problem: Supporting heterogeneous data that is not explicitly connected and resides in distinct data repositories is a major challenge. Solution: This work proposes HKPoly, a federated architecture that encapsulates data heterogeneity, location, and linkage. IS Theory: We employed Representation Theory to create the models of the architecture and its components. Method: We implemented the architecture, applied it in an Oil & Gas scenario, and compared it to a multi-database system. Results: Queries written for the proposal are half as complex as those written for the relational multi-database system, at the cost of about 30% additional query processing time. Contributions: An architecture to query heterogeneous data, the requirements and components for its implementation, and an implementation example using state-of-the-art concepts.
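To make the idea of encapsulating heterogeneity, location, and linkage concrete, the following minimal sketch may help. It is not HKPoly's implementation: the store contents, the `links` list, and the `reports_for_field` function are hypothetical names and toy data invented for illustration. It assumes one relational store (SQLite), one document store (a plain dictionary), and a linkage set kept outside both, so that a caller issues a single query without knowing where each fragment lives.

```python
import sqlite3

# Hypothetical relational store: well headers kept in SQLite (toy data).
relational = sqlite3.connect(":memory:")
relational.execute("CREATE TABLE well (well_id TEXT PRIMARY KEY, field TEXT)")
relational.executemany(
    "INSERT INTO well VALUES (?, ?)",
    [("W-1", "Alpha"), ("W-2", "Beta")],
)

# Hypothetical document store: reports keyed by their own identifiers.
documents = {
    "doc-17": {"title": "Core analysis", "pages": 12},
    "doc-42": {"title": "Drilling log", "pages": 30},
}

# Linkage kept outside both stores: (well_id, document_id) pairs.
links = [("W-1", "doc-17"), ("W-1", "doc-42")]


def reports_for_field(field):
    """Return sorted titles of documents linked to wells in the given field."""
    # Step 1: resolve the relational fragment.
    rows = relational.execute(
        "SELECT well_id FROM well WHERE field = ?", (field,)
    ).fetchall()
    well_ids = {row[0] for row in rows}
    # Step 2: follow the external links into the document store.
    return sorted(
        documents[doc_id]["title"]
        for well_id, doc_id in links
        if well_id in well_ids and doc_id in documents
    )


print(reports_for_field("Alpha"))
```

In this sketch the caller names only a field; the facade function hides which store holds the wells, which holds the reports, and how the two are connected. A multi-database approach would instead require the query author to address each store and express the join explicitly.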