Publication
ICME 2004
Conference paper

Detecting discussion scenes in instructional videos

Abstract

This paper addresses the problem of detecting discussion scenes in instructional videos using statistical approaches. Specifically, given a series of speech segments separated from the audio tracks of educational videos, we first model the instructor's voice with a Gaussian mixture model (GMM); a four-state transition machine then extracts discussion scenes in real time based on detected instructor-student speaker change points. Meanwhile, we keep updating the GMM to accommodate variation in the instructor's voice over time. Promising experimental results have been achieved on five educational videos (from the IBM MicroMBA program), and interesting instruction/teaching patterns have been observed. The extracted scene information facilitates semantic indexing and structuring of instructional video content.
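The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature vectors, the log-likelihood threshold, and the reduced two-state scene logic (standing in for the paper's four-state machine) are all assumptions made for the sake of a runnable example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical stand-ins for acoustic feature vectors (e.g. MFCCs):
# instructor speech clusters in one region, student speech in another.
instructor_train = rng.normal(0.0, 1.0, size=(500, 13))
segments = [
    rng.normal(0.0, 1.0, size=(50, 13)),  # instructor
    rng.normal(5.0, 1.0, size=(50, 13)),  # student
    rng.normal(5.0, 1.0, size=(50, 13)),  # student
    rng.normal(0.0, 1.0, size=(50, 13)),  # instructor
]

# Step 1: model the instructor's voice with a GMM.
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
gmm.fit(instructor_train)

THRESHOLD = -25.0  # illustrative log-likelihood threshold, not from the paper


def is_instructor(segment):
    """Label a segment by its mean log-likelihood under the instructor GMM."""
    return gmm.score(segment) > THRESHOLD


def extract_discussion_scenes(labels):
    """Step 2: scan speaker labels for instructor-student change points.

    A discussion scene opens when a student segment follows the instructor
    and closes when the instructor resumes (a simplified stand-in for the
    paper's four-state transition machine).
    """
    scenes, start = [], None
    for i, inst in enumerate(labels):
        if not inst and start is None:
            start = i
        elif inst and start is not None:
            scenes.append((start, i - 1))
            start = None
    if start is not None:
        scenes.append((start, len(labels) - 1))
    return scenes


labels = [is_instructor(s) for s in segments]
print(extract_discussion_scenes(labels))  # segments 1-2 form one discussion scene
```

The paper additionally updates the GMM online as the instructor's voice drifts; in a sketch like this, that would amount to periodically refitting the model on recently accepted instructor segments.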
