Large-scale nonparametric estimation of vehicle travel time distributions
Abstract
Fitting travel-time distributions for vehicle traffic is an important application of spatio-temporal data mining. Regression methods that forecast the expected travel time are the standard approach to travel-time prediction, but state-of-the-art risk-sensitive route recommendation systems require the full travel-time distribution. The authors introduce a novel nonparametric density estimator of travel time for each road, or link. The estimator consists of basis functions modeled as mixtures of gamma or log-normal density functions, a sparse link similarity matrix given as an approximate diffusion kernel on a link connectivity graph, and importance weights for each link. Unlike existing nonparametric methods, which are computationally intensive, the new estimator can be applied stably to large datasets because the basis functions and the importance weights are globally optimized with a fast convex clustering algorithm. Experimental results on real probe-car datasets show the advantages of the new nonparametric estimator over parametric regression methods. Copyright © 2012 by the Society for Industrial and Applied Mathematics.
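As a rough illustration of how the three ingredients named in the abstract could fit together, the following minimal sketch evaluates a per-link density from gamma basis functions, a diffusion-style link similarity matrix, and per-link weights. The abstract does not give the estimator's formula, so everything here is an assumption made for illustration: the combination rule p_i(t) = sum_j K[i][j] * sum_k w[j][k] * f_k(t), the kernel approximation (I - (beta/steps) L)^steps for exp(-beta L), the function names, and the toy numbers. The convex clustering step that optimizes the bases and weights is not shown.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density with the given shape and scale, evaluated at t."""
    if t <= 0:
        return 0.0
    log_p = ((shape - 1) * math.log(t) - t / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_p)

def mat_mul(a, b):
    """Dense product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def approx_diffusion_kernel(adjacency, beta=0.5, steps=4):
    """Approximate exp(-beta * L), L = D - A, by (I - (beta/steps) * L)^steps.

    Assumed approximation, not taken from the paper. Only links within
    `steps` hops of each other get a nonzero entry, so the result is
    sparse on a large road graph. Entries stay nonnegative as long as
    (beta / steps) * max_degree <= 1.
    """
    n = len(adjacency)
    step = [[0.0] * n for _ in range(n)]
    for i in range(n):
        degree = sum(adjacency[i])
        for j in range(n):
            laplacian = (degree if i == j else 0.0) - adjacency[i][j]
            step[i][j] = (1.0 if i == j else 0.0) - (beta / steps) * laplacian
    kernel = step
    for _ in range(steps - 1):
        kernel = mat_mul(kernel, step)
    return kernel

def link_density(t, link, kernel, weights, bases):
    """Hypothetical combination: kernel-smoothed mixture of shared bases."""
    basis_vals = [gamma_pdf(t, shape, scale) for shape, scale in bases]
    return sum(
        kernel[link][j] * sum(w * b for w, b in zip(weights[j], basis_vals))
        for j in range(len(weights))
    )

# Toy example (made-up values): three links connected in a chain 0 - 1 - 2.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
bases = [(2.0, 10.0), (5.0, 12.0)]               # (shape, scale), in seconds
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]   # each row sums to 1
kernel = approx_diffusion_kernel(adjacency)
density_at_30s = link_density(30.0, 0, kernel, weights, bases)
```

Because each row of this kernel sums to one and each weight row sums to one, the value returned for a link is itself a proper density: a link with few observations borrows probability mass from the mixtures of its graph neighbors.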