Pavel Klavík, A. Cristiano I. Malossi, et al.
Philos. Trans. R. Soc. A
Many techniques for retrieving arbitrary content from audio have been developed to address the important challenge of providing fast access to very large volumes of multimedia data. We present a two-stage method for fast audio search, in which a vector-space modelling approach is first used to retrieve a short list of candidate audio segments for a query. The list of candidate segments is then searched using a word-based index for known words and a phone-based index for out-of-vocabulary words. We explore various system configurations and examine trade-offs between speed and accuracy. We evaluate our audio search system according to the NIST 2006 Spoken Term Detection evaluation initiative. We find that we can obtain a 30-fold speedup for the search phase of our system with a 10% relative loss in accuracy. © 2007 IEEE.
Erik Altman, Jovan Blanusa, et al.
NeurIPS 2023
Conrad Albrecht, Jannik Schneider, et al.
CVPR 2025
Miao Guo, Yong Tao Pei, et al.
WCITS 2011