Publication
ASSETS 2018
Conference paper
Leveraging pauses to improve video captions
Abstract
Currently, video sites that offer automatic speech recognition display the auto-generated captions as arbitrarily segmented lines of unpunctuated text. This method of displaying captions can be detrimental to meaning, especially for deaf users who rely almost exclusively on captions. However, the captions can be made more readable by automatically detecting pauses in speech and using the pause duration as a determinant both for inserting simple punctuation and for more meaningfully segmenting and timing the display of lines of captions. A small sampling of users suggests that such adaptations to caption display are preferred by a majority of users, whether they are deaf, hard of hearing, or hearing.
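The idea of using pause duration to drive punctuation and line segmentation can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes word-level timestamps from an ASR system, and the two pause thresholds are hypothetical values chosen for the example.

```python
# Illustrative sketch of pause-based caption punctuation and segmentation.
# Assumes ASR output as (word, start_sec, end_sec) tuples. The thresholds
# below are hypothetical, not values from the paper.
COMMA_PAUSE = 0.3  # medium pause -> insert a comma (assumed threshold)
BREAK_PAUSE = 0.7  # long pause -> end the sentence and start a new line

def segment_captions(words):
    """words: list of (text, start_sec, end_sec) tuples.
    Returns caption lines with simple punctuation inserted at pauses."""
    lines, current = [], []
    for i, (text, start, end) in enumerate(words):
        current.append(text)
        if i + 1 < len(words):
            gap = words[i + 1][1] - end  # silence before the next word
            if gap >= BREAK_PAUSE:
                current[-1] += "."
                lines.append(" ".join(current))
                current = []
            elif gap >= COMMA_PAUSE:
                current[-1] += ","
    if current:  # flush the final caption line
        current[-1] += "."
        lines.append(" ".join(current))
    return lines
```

For example, with a one-second gap before the word "really", `segment_captions([("captions", 0.0, 0.5), ("help", 0.6, 0.9), ("everyone", 1.0, 1.5), ("really", 2.5, 2.9)])` yields two caption lines, `["captions help everyone.", "really."]`, rather than one arbitrarily segmented unpunctuated run.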