Publication
MM 2001
Conference paper
Multimedia edges: Finding hierarchy in all dimensions
Abstract
This paper describes a new unified representation for the information in a video. We reduce the dimensionality of each signal, using a singular-value decomposition for the semantic and image data and mel-frequency cepstral coefficients for the audio data, and then concatenate the resulting vectors to form a multi-dimensional representation of the video. Using scale-space techniques, we find large jumps in the video's path through this space, which we call edges. We use these techniques to analyze the temporal properties of the audio and image data in a video. This analysis creates a hierarchical segmentation of the video, or a table of contents, from the audio, semantic and image data.
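The edge-finding idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the per-frame vectors have already been reduced upstream (by SVD or MFCC), uses tiny synthetic streams in their place, and uses a fixed strength threshold that is purely an illustrative choice. All function names and parameters here are hypothetical. The sketch concatenates the streams into one path, smooths that path with Gaussians of different widths, and reports local maxima in the frame-to-frame change as edges.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel, truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_path(path, sigma):
    """Smooth every dimension of the path over time (ends are clamped)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n, dims = len(path), len(path[0])
    smoothed = []
    for t in range(n):
        row = [0.0] * dims
        for k, w in enumerate(kernel):
            src = min(max(t + k - radius, 0), n - 1)
            for d in range(dims):
                row[d] += w * path[src][d]
        smoothed.append(row)
    return smoothed

def edge_strength(path, sigma):
    """Euclidean size of the frame-to-frame step along the smoothed path."""
    s = smooth_path(path, sigma)
    return [
        math.sqrt(sum((b - a) ** 2 for a, b in zip(s[t], s[t + 1])))
        for t in range(len(s) - 1)
    ]

def find_edges(path, sigma, threshold):
    """Frames where the step size is a local maximum above the threshold.

    The threshold is an assumption of this sketch, not part of the paper.
    """
    strength = edge_strength(path, sigma)
    edges = []
    for t in range(1, len(strength) - 1):
        is_peak = strength[t] > strength[t - 1] and strength[t] >= strength[t + 1]
        if is_peak and strength[t] >= threshold:
            edges.append(t + 1)  # first frame of the new segment
    return edges

# Synthetic stand-ins for the reduced streams (60 frames):
# a 2-D "image" stream with a large jump at frame 30,
# and a 1-D "audio" stream with a small jump at frame 15.
image = [[0.0, 0.0] if t < 30 else [5.0, 5.0] for t in range(60)]
audio = [[0.0] if t < 15 else [1.0] for t in range(60)]

# Concatenate the per-frame vectors into one multi-dimensional path.
path = [img + aud for img, aud in zip(image, audio)]

for sigma in (1.0, 4.0):
    print(sigma, find_edges(path, sigma, threshold=0.2))
```

In this toy data, the fine scale (sigma of 1) reports both jumps, while the coarse scale (sigma of 4) keeps only the large one at frame 30. Reading edges that survive coarse smoothing as top-level boundaries and edges seen only at fine scales as sub-boundaries gives one simple way to picture the hierarchical table of contents the abstract describes.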