Workshop paper

Task-driven Sensing with Coarse-to-Fine Glimpse-based Active Perception

Abstract

Modern computer vision models commonly rely on passive sensing and process each image in its entirety in a single pass. Because such models cannot zoom in on task-relevant regions for detailed analysis, this approach is limited for high-resolution, cluttered scenes in which only a small area is relevant to the task at hand. A particularly challenging problem in this context is instance detection, which involves localizing specific object instances given a few visual examples. We introduce an active sensing system that uses a brain-inspired coarse-to-fine strategy to glimpse over the image by steering a retina-like sensor. The sensor uses a log-polar pixel layout that facilitates precise localization of task-relevant regions.
Our system can be integrated with various state-of-the-art instance detectors. It improves their performance by up to 90%, making even small models developed for edge devices perform on par with or, in difficult cases, even better than their large counterparts. Given these performance gains, our model could become a complementary part of sensor hardware, enabling active, task-driven sensing.
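
To make the log-polar pixel layout concrete, the following is a minimal illustrative sketch of how a retina-like glimpse could be sampled around a fixation point. It is not the implementation used in this work: the function names (log_polar_grid, glimpse), the parameters (r_min, r_max, n_rings, n_wedges), and the nearest-neighbour sampling are all assumptions chosen for illustration.

```python
import numpy as np


def log_polar_grid(center, r_min, r_max, n_rings, n_wedges):
    """Return an (n_rings, n_wedges, 2) array of (x, y) sample points.

    Hypothetical helper: ring radii grow geometrically, so samples are
    dense near the fixation point and sparse in the periphery.
    """
    cx, cy = center
    radii = np.geomspace(r_min, r_max, n_rings)
    angles = np.linspace(0.0, 2.0 * np.pi, n_wedges, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    xs = cx + rr * np.cos(aa)
    ys = cy + rr * np.sin(aa)
    return np.stack([xs, ys], axis=-1)


def glimpse(image, center, r_min=1.0, r_max=64.0, n_rings=32, n_wedges=64):
    """Sample `image` on a log-polar grid centred at `center` = (x, y).

    Assumption: nearest-neighbour lookup with coordinates clipped to the
    image border; a real sensor model would likely interpolate or pool.
    """
    grid = log_polar_grid(center, r_min, r_max, n_rings, n_wedges)
    h, w = image.shape[:2]
    xs = np.clip(np.rint(grid[..., 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(grid[..., 1]).astype(int), 0, h - 1)
    return image[ys, xs]


if __name__ == "__main__":
    # Synthetic 100x100 image whose pixel value encodes its position.
    image = np.arange(100 * 100).reshape(100, 100)
    patch = glimpse(image, center=(50, 50), r_max=40.0, n_rings=8, n_wedges=16)
    print(patch.shape)
```

In this sketch, moving the sensor amounts to calling glimpse with a new center, and a coarse-to-fine schedule could be approximated by shrinking r_max from one glimpse to the next; both choices are illustrative rather than taken from the paper.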