Customer clustering using semi-supervised geographic information
Abstract
We present an innovative approach for clustering retail customers using semi-supervised geographic information. The approach clusters (or segments) customers not only by attributes such as age and spending, but also by their place of residence, which can reveal customer patterns useful for the retailer's marketing strategy. In real retail applications, unsupervised clustering faces the problem of normalizing multiple heterogeneous features, which limits its findings. Moreover, human knowledge cannot be incorporated into the process. Consequently, we propose a semi-supervised approach that supports two kinds of human knowledge about the clustering: 1) hard constraints ("must-link" and "cannot-link") and 2) soft constraints (distance comparisons). These constraints apply naturally to the task of customer clustering. Based on the constraints, we develop a framework that integrates metric learning (by weighting features) with clustering. Experimental results on real customer profiles show that, compared with the unsupervised approach, the proposed approach produces reasonable clusters. In addition, the learned feature weights reveal valuable knowledge about the customers. ©2009 IEEE.
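To make the two kinds of constraints concrete, the following is a minimal Python sketch, not the authors' method: the abstract does not specify the distance function, the constraint encoding, or the learning procedure. It assumes a weighted Euclidean distance, must-link and cannot-link constraints given as index pairs, and a distance comparison given as a triple (a, b, c) meaning "a should be closer to b than to c". All function names and the toy customer records are hypothetical.

```python
import math


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance; the weights rescale heterogeneous features."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def hard_violations(labels, must_link, cannot_link):
    """Count must-link pairs that are split and cannot-link pairs that are merged."""
    split = sum(1 for i, j in must_link if labels[i] != labels[j])
    merged = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return split + merged


def soft_violations(points, weights, comparisons):
    """Count triples (a, b, c) in which a is NOT closer to b than to c."""
    return sum(
        1
        for a, b, c in comparisons
        if weighted_distance(points[a], points[b], weights)
        >= weighted_distance(points[a], points[c], weights)
    )


# Hypothetical toy customers: (age, monthly spending, home x, home y).
customers = [
    (25, 200.0, 0.0, 0.0),
    (27, 260.0, 0.1, 0.0),  # lives next to customer 0 but spends more
    (30, 205.0, 5.0, 5.0),  # spends like customer 0 but lives far away
]

# Assumed human knowledge: customers 0 and 1 belong together, 0 and 2 do not,
# and customer 0 should be closer to customer 1 than to customer 2.
must_link = [(0, 1)]
cannot_link = [(0, 2)]
comparisons = [(0, 1, 2)]

uniform = (1.0, 1.0, 1.0, 1.0)    # every feature counts equally
geo_heavy = (0.0, 0.0, 1.0, 1.0)  # only the geographic features count

print(soft_violations(customers, uniform, comparisons))    # unnormalized spending dominates
print(soft_violations(customers, geo_heavy, comparisons))  # geography dominates
print(hard_violations([0, 0, 1], must_link, cannot_link))  # assignment consistent with constraints
```

In this toy data, uniform weights let the unnormalized spending feature dominate the distance, so the distance comparison is violated, while weights that emphasize the geographic features satisfy it. A metric-learning step would search for such weights automatically; how the paper does so is not stated in the abstract.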