Linking identical neighborly partitions for efficient high-dimensional similarity search in unstructured peer-to-peer systems
Abstract
Peer-to-Peer (P2P) computing has recently attracted a great deal of research attention. In a P2P system, a large number of nodes can potentially be pooled together to share their resources, information, and services. However, existing unstructured P2P systems lack support for content-based search over data objects which are generally represented by high-dimensional feature vectors. In this paper, we propose an efficient and effective indexing mechanism to facilitate high-dimensional similarity query in unstructured P2P systems, named Linking Identical Neighborly Partitions (LINP), which combines both space partitioning technique and routing index technique. With the aid of LINP, each peer can not only process similarity query efficiently over its local data, but also can route the query to the promising peers which may contain the desired data. In the proposed scheme, each peer summarizes its local data using the space partitioning technique, and exchanges the summarized index with its neighboring peers to construct routing indices. Furthermore, to improve the system performance with peer updates, we propose an extension of the LINP, named LINP+, where each peer can reconfigure its neighboring peers to keep relevant peers nearby. The performance of our proposed scheme is evaluated over both synthetic and real-life high-dimensional datasets, and experimental results show the superiority of our proposed scheme.