ADnEV: Cross-Domain Schema Matching using Deep Similarity Matrix Adjustment and Evaluation
Abstract
Schema matching is a process that supports the integration of structured and semi-structured data. Used in many contemporary business and commerce applications, it has been investigated for many years in the fields of databases, AI, the Semantic Web, and data mining. The core challenge remains the creation of high-quality algorithmic matchers: automatic tools for identifying correspondences among data concepts (e.g., database attributes). In this work, we offer a novel post-processing step for schema matching that improves the final matching outcome without human intervention. We present a new mechanism, similarity matrix adjustment, to calibrate a matching result, and propose an algorithm (dubbed ADnEV) that uses deep neural networks to manipulate similarity matrices created by state-of-the-art algorithmic matchers. ADnEV learns two models that iteratively adjust and evaluate the original similarity matrix. We empirically demonstrate the effectiveness of the proposed algorithmic solution for improving matching results, using real-world benchmark ontology and schema sets. We show that ADnEV can generalize to new domains without the need to learn the domain terminology, thus allowing cross-domain learning. We also show ADnEV to be a powerful tool for handling schemata whose matching is particularly challenging. Finally, we show the benefit of using ADnEV in the related integration task of ontology alignment.
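To make the adjust-and-evaluate idea concrete, the following is a minimal sketch of such a loop, not the ADnEV implementation. It assumes the two learned models can be treated as callables; `toy_adjust` and `toy_evaluate` are hypothetical hand-written stand-ins for the deep neural networks, and the stopping rule (stop when the evaluator's score no longer improves, or after `max_iters` rounds) is likewise an assumption made for illustration.

```python
import numpy as np


def toy_adjust(sim: np.ndarray) -> np.ndarray:
    """Hypothetical adjustor (stand-in for a learned model).

    Scales each entry by its ratio to the row maximum, so the strongest
    candidate in a row is kept and weaker candidates are pushed toward 0.
    """
    row_max = sim.max(axis=1, keepdims=True)
    row_max = np.where(row_max == 0, 1.0, row_max)  # avoid division by zero
    return sim * (sim / row_max)


def toy_evaluate(sim: np.ndarray) -> float:
    """Hypothetical evaluator (stand-in for a learned model).

    Scores a matrix by the mean gap between each row's best entry and
    the row average; a larger gap means a more decisive matrix.
    """
    return float(np.mean(sim.max(axis=1) - sim.mean(axis=1)))


def adjust_and_evaluate(sim, adjust=toy_adjust, evaluate=toy_evaluate, max_iters=10):
    """Repeatedly adjust the matrix while the evaluator's score improves."""
    best, best_score = sim, evaluate(sim)
    for _ in range(max_iters):
        candidate = adjust(best)
        score = evaluate(candidate)
        if score <= best_score:  # assumed stopping rule: no improvement
            break
        best, best_score = candidate, score
    return best, best_score


# Rows: attributes of one schema; columns: attributes of the other.
sim = np.array([[0.9, 0.4],
                [0.3, 0.8]])
adjusted, score = adjust_and_evaluate(sim)
```

In this sketch the input matrix would come from any algorithmic matcher, and swapping in trained models only requires passing different `adjust` and `evaluate` callables.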