Heterogeneous cross domain ranking in latent space
Abstract
Traditional ranking mainly focuses on one type of data source, and effective modeling still relies on a sufficiently large number of labeled examples. However, in many real-world applications, particularly with the rapid growth of Web 2.0, ranking over multiple interrelated (heterogeneous) domains has become a common situation, where some domains have a large amount of training data while others offer very little. One important question is: "If there is not sufficient supervision in the domain of interest, how could one borrow labeled information from a related but heterogeneous domain to build an accurate model?" This paper explores such an approach by bridging two heterogeneous domains via a latent space. We propose a regularized framework that simultaneously minimizes two loss functions corresponding to two related but different information sources, by mapping each domain onto a "shared latent space" that captures similar and transferable concepts. We solve this problem by optimizing a convex upper bound of the non-continuous loss function and derive its generalization bound. Experimental results on three different genres of data sets demonstrate the effectiveness of the proposed approach.
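As a rough illustration of the kind of objective described above, the joint minimization over a shared latent space might take the following form; the symbols ($\phi_s, \phi_t$ for the latent mappings, $L_s, L_t$ for the ranking losses, $w$ for the shared model, $\Omega$ for the regularizer, and $\lambda, \mu$ for trade-off weights) are illustrative assumptions rather than the paper's actual notation:

\[
\min_{w,\,\phi_s,\,\phi_t}\; L_s\bigl(w^{\top}\phi_s(X_s),\, Y_s\bigr) \;+\; \lambda\, L_t\bigl(w^{\top}\phi_t(X_t),\, Y_t\bigr) \;+\; \mu\, \Omega(w, \phi_s, \phi_t)
\]

Here $L_s$ and $L_t$ are ranking losses on the source and target domains, $\phi_s$ and $\phi_t$ map each domain's instances onto the shared latent space, $w$ is a ranking model learned in that space, and $\Omega$ penalizes model complexity; as the abstract notes, the non-continuous ranking losses are optimized through convex upper bounds.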