Conference paper

Cultural voice markers in speech-to-speech machine translation systems


Current implementations of real-time speech-to-speech (S2S) translation systems for intercultural collaboration have mainly focused on the accuracy of speech recognition and of the translated content. Typically, the translated utterance is presented to users through text-to-speech (TTS) synthesis, without conveying cultural nuances in the tone of voice. This study investigates whether there are cross-cultural markers of variation in voice dynamics, and whether these have any impact on user satisfaction. Based on subjective user evaluations (Chinese and English), we conclude that there are salient cross-cultural voice markers relevant to the interaction of culture and system design, with a noticeable impact on user satisfaction in TTS and S2S systems.
