- Avihu Dekel, Slava Shechtman, et al. (2024). ICASSP 2024
Conversational Text to Speech
Overview
The voice channel is a crucial element in customer-care scenarios, especially over the phone, and text-to-speech (TTS) systems play a fundamental role in establishing and maintaining a positive customer experience.
We are developing a low-latency, expressive TTS system intended for conversational voice agents in customer care. By designing and recording a speech corpus with conversational content, expressive speaking styles, and interjections, and by applying deep learning and data augmentation techniques, our conversational TTS system can produce human-sounding, expressive spoken responses in a variety of voices.
Furthermore, we have enabled the technology to synthesize expressive speech while the text is still being generated by a large language model (LLM), with minimal latency between text generation and speech generation. This makes it compatible with generative conversational AI systems.
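One way to picture synthesis that runs while an LLM is still producing text is to buffer the streamed fragments and hand them to the synthesizer in short, punctuation-delimited chunks. The sketch below is only an illustration under stated assumptions, not a description of the system above: the names `chunk_stream`, `speak_stream`, and the `synthesize` callback are hypothetical, as are the punctuation set and the `min_chars` threshold.

```python
from typing import Callable, Iterable, Iterator

# Punctuation treated as a safe place to cut a chunk (an assumption of this sketch).
BOUNDARIES = ".!?;:,"


def chunk_stream(tokens: Iterable[str], min_chars: int = 20) -> Iterator[str]:
    """Group streamed text fragments into speakable chunks.

    A chunk is emitted once the buffered text ends in boundary punctuation
    and holds at least `min_chars` characters. Leftover text is flushed
    when the stream ends.
    """
    buffer = ""
    for token in tokens:
        buffer += token
        text = buffer.strip()
        if text and len(text) >= min_chars and text[-1] in BOUNDARIES:
            yield text
            buffer = ""
    if buffer.strip():
        yield buffer.strip()


def speak_stream(tokens: Iterable[str],
                 synthesize: Callable[[str], object]) -> Iterator[object]:
    """Call a (hypothetical) TTS callback on each chunk as soon as it is ready."""
    for chunk in chunk_stream(tokens):
        yield synthesize(chunk)


if __name__ == "__main__":
    # Stand-in for an LLM token stream.
    llm_tokens = ["Hello", ",", " thanks", " for", " calling", ".",
                  " How", " can", " I", " help", " you", " today", "?"]
    for chunk in chunk_stream(llm_tokens):
        print(chunk)
```

A smaller `min_chars` lets speech start sooner but gives the synthesizer less context for prosody. A larger value does the opposite, so the threshold is a latency versus expressiveness trade-off in this sketch.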
Publications
- Claudio Santos Pinhanez, Raul Fernandez, et al. (2024). IUI 2024
- Slava Shechtman, Raul Fernandez (2023). INTERSPEECH 2023
- Edmilson Da Silva Morais, Matheus Costa Damasceno, et al. (2023). ICASSP 2023
- Raul Fernandez, David Haws, et al. (2022). INTERSPEECH 2022
- Zvi Kons, Hagai Aronowitz, et al. (2022). INTERSPEECH 2022
- Edmilson Morais, Ron Hoory, et al. (2022). ICASSP 2022
- Hagai Aronowitz, Itai Gat, et al. (2022). ICASSP 2022