About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Conference paper
Recognizing hand-object interactions in wearable camera videos
Abstract
Wearable computing technologies are advancing rapidly and enabling users to easily record daily activities for applications such as life-logging or health monitoring. Recognizing hand and object interactions in these videos will help broaden application domains, but recognizing such interactions automatically remains a difficult task. Activity recognition from the first-person point-of-view is difficult because the video includes constant motion, cluttered backgrounds, and sudden changes of scenery. Recognizing hand-related activities is particularly challenging due to the many temporal and spatial variations induced by hand interactions. We present a novel approach to recognize hand-object interactions by extracting both local motion features representing the subtle movements of the hands and global hand shape features to capture grasp types. We validate our approach on multiple egocentric action datasets and show that state-of-the-art performance can be achieved by considering both local motion and global appearance information.