Publication
ASSETS 2018
Conference paper
Leveraging pauses to improve video captions
Abstract
Currently, video sites that offer automatic speech recognition display the auto-generated captions as arbitrarily segmented lines of unpunctuated text. This method of displaying captions can be detrimental to meaning, especially for deaf users who rely almost exclusively on captions. However, the captions can be made more readable by automatically detecting pauses in speech and using the pause duration as a determinant both for inserting simple punctuation and for more meaningfully segmenting and timing the display of lines of captions. A small sampling of users suggests that such adaptations to caption display are preferred by a majority of users, whether they are deaf, hard of hearing, or hearing.
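The idea of using pause duration to drive punctuation and line segmentation can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes word-level timestamps from an ASR system, and the two pause thresholds are hypothetical values chosen for the example.

```python
# Illustrative sketch of pause-based caption punctuation and segmentation.
# Assumes ASR output as (word, start_sec, end_sec) tuples. The thresholds
# below are hypothetical, not values from the paper.
COMMA_PAUSE = 0.3  # medium pause -> insert a comma (assumed threshold)
BREAK_PAUSE = 0.7  # long pause -> end the sentence and start a new line

def segment_captions(words):
    """words: list of (text, start_sec, end_sec) tuples.
    Returns caption lines with simple punctuation inserted at pauses."""
    lines, current = [], []
    for i, (text, start, end) in enumerate(words):
        current.append(text)
        if i + 1 < len(words):
            gap = words[i + 1][1] - end  # silence before the next word
            if gap >= BREAK_PAUSE:
                current[-1] += "."
                lines.append(" ".join(current))
                current = []
            elif gap >= COMMA_PAUSE:
                current[-1] += ","
    if current:  # flush the final caption line
        current[-1] += "."
        lines.append(" ".join(current))
    return lines
```

For example, with a one-second gap before the word "really", `segment_captions([("captions", 0.0, 0.5), ("help", 0.6, 0.9), ("everyone", 1.0, 1.5), ("really", 2.5, 2.9)])` yields two caption lines, `["captions help everyone.", "really."]`, rather than one arbitrarily segmented unpunctuated run.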