About
Bluetalks are vibrant gatherings with in-depth discussions on technology and innovation, showcasing the latest research projects from IBM Research Brazil. Expand your horizons with disruptive ideas and new perspectives on cutting-edge topics. Join us to explore our innovative projects even further.
Speakers
Claudio Santos Pinhanez
Daniel Civitarese
Daniela Szwarcman
Leonardo Azevedo
Maciel Zortea
Maysa Macedo
Agenda
Modern information systems typically use data with heterogeneous models and data schemas. For example, a smart transportation system uses data generated from various sources, such as mobile devices, airborne sensing systems, traffic cameras, microphones, RFID readers, etc.
A single database rarely works for such diverse data. Consolidating everything into one database through ETL (Extract-Transform-Load) processes, manual curation, and ongoing maintenance is costly and labor-intensive. Data in these scenarios therefore generally resides in the storage systems best suited to how it is stored and accessed, such as relational databases, NoSQL stores, HDFS, processing frameworks, hybrid multimodal databases, or hybrid NewSQL systems.
The objective of this presentation is to introduce the main concepts of heterogeneous data storage systems and to illustrate a solution in a use case scenario in the Oil and Gas area. The presentation covers the following topics: characterizing the existing classes of storage solutions, with examples of systems; presenting a taxonomy of federated data systems, along with their requirements and challenges; and illustrating an implementation in a use case scenario.
The use case scenario includes activities for pre-processing geological data to generate data for training and validating Deep Learning (DL) models in the Oil and Gas area. The solution for this scenario will be illustrated using the PostgreSQL FDW (Foreign Data Wrapper). This solution allows creating tables in PostgreSQL that expose data stored externally in heterogeneous data storage systems.
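As a rough illustration of the foreign-table idea, the sketch below builds the SQL statements that would expose a CSV file as a PostgreSQL table through `file_fdw`, one of the wrappers shipped with PostgreSQL. It is not taken from the talk: the helper `file_fdw_ddl`, the server and table names, the columns, and the file path are all hypothetical, and the strings are interpolated without any escaping, so this is only suitable for trusted, hand-written values.

```python
def file_fdw_ddl(server, table, columns, csv_path):
    """Return SQL statements that expose a CSV file as a foreign table.

    Hypothetical helper for illustration only: no identifier or literal
    escaping is performed, so pass trusted values.
    """
    # Render the column list, e.g. "well_id text, depth_m double precision".
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return [
        # Enable the wrapper in the current database.
        "CREATE EXTENSION IF NOT EXISTS file_fdw;",
        # A foreign server object is required even for local files.
        f"CREATE SERVER {server} FOREIGN DATA WRAPPER file_fdw;",
        # The foreign table stores no rows; queries read the file directly.
        f"CREATE FOREIGN TABLE {table} ({cols}) SERVER {server} "
        f"OPTIONS (filename '{csv_path}', format 'csv', header 'true');",
    ]


# Assumed example values, not data from the presentation.
statements = file_fdw_ddl(
    server="geo_files",
    table="well_logs",
    columns=[("well_id", "text"), ("depth_m", "double precision")],
    csv_path="/data/well_logs.csv",
)

for statement in statements:
    print(statement)
```

Once such statements are run in PostgreSQL, the foreign table can be queried and joined like a local one. Other wrappers follow the same pattern of server plus foreign table, with options specific to each external system.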
About the speaker: Leonardo G. Azevedo has been a Research Scientist at IBM Research Brazil since 2013. He holds a Ph.D. (2005) and an M.Sc. (2001) from PESC/COPPE/UFRJ and a Bachelor's degree in Informatics from UFRJ (Rio de Janeiro, Brazil). He was a professor at UniRio (2006 to 2018) and a researcher in the Graduate Program in Informatics (PPGI/UniRio) (2009 to 2018). He has more than 20 years of experience in system development and applied research, working on projects for national and international organizations. His research areas are Distributed Systems, Service-Oriented Architecture (SOA), Microservices Architecture (MSA), Databases, Provenance, Data Integration, Polystore, Knowledge Engineering, Ontologies, and Business Process Management (BPM).
Leonardo Azevedo, Staff Research Scientist, Knowledge Engineer and Distributed Service Architect, IBM Research