Publication
VLDB 2017
Conference paper
SystemER: A human-in-the-loop system for explainable entity resolution
Abstract
Entity Resolution (ER) is the task of identifying different representations of the same real-world object. To achieve scalability and the desired level of quality, the typical ER pipeline includes multiple steps that may involve low-level coding and extensive human labor. We present SystemER, a tool for learning explainable ER models that reduces human labor throughout all stages of the ER pipeline. SystemER achieves explainability by learning rules that not only perform a given ER task but are also human-comprehensible; this provides transparency into the learning process and enables domain experts to verify and customize the learned model. By leveraging a human in the loop and active learning, SystemER also ensures that a small number of labeled examples is sufficient to learn high-quality ER models. SystemER is a full-fledged tool that includes an easy-to-use interface, support for both flat files and semi-structured data, and scale-out capabilities by distributing computation via Apache Spark.
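To make the idea of a human-comprehensible ER rule concrete, the following is a minimal sketch, not SystemER's actual rule language or API. It assumes a rule can be modeled as a conjunction of named predicates over a pair of records. The `Predicate` and `Rule` classes, the field names `first_name` and `last_name`, and the sample records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical record type: a flat mapping from field name to string value.
Record = Dict[str, str]


@dataclass(frozen=True)
class Predicate:
    """A single readable condition on a pair of records (illustrative only)."""
    description: str
    test: Callable[[Record, Record], bool]


@dataclass
class Rule:
    """A conjunction of predicates; a pair matches only if every predicate holds."""
    predicates: List[Predicate]

    def matches(self, a: Record, b: Record) -> bool:
        return all(p.test(a, b) for p in self.predicates)

    def explain(self) -> str:
        # The rule can be read back as plain text, which is the point of the sketch.
        return " AND ".join(p.description for p in self.predicates)


def same_last_name(a: Record, b: Record) -> bool:
    return a["last_name"].strip().lower() == b["last_name"].strip().lower()


def first_initial_agrees(a: Record, b: Record) -> bool:
    return a["first_name"][:1].lower() == b["first_name"][:1].lower()


# Hypothetical rule and records, assumed for illustration.
rule = Rule([
    Predicate("last names are equal (case-insensitive)", same_last_name),
    Predicate("first initials agree", first_initial_agrees),
])

r1 = {"first_name": "Jonathan", "last_name": "Smith"}
r2 = {"first_name": "J.", "last_name": "smith"}
r3 = {"first_name": "Maria", "last_name": "Smith"}

print(rule.explain())
print(rule.matches(r1, r2))
print(rule.matches(r1, r3))
```

Because each predicate carries its own description, a domain expert could in principle inspect such a rule, remove a condition, or add one, which is the kind of verification and customization the abstract describes. How SystemER itself represents and learns rules is not specified here.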