Search for Keywords and Vocal Elements in Audio Recordings
Abstract
This paper deals with searching for keywords and non-verbal vocal elements in audio recordings. Efficient detection of specific words or sounds embedded in continuous speech is based on isolated word recognition approaches. Mel-frequency cepstral coefficients and several combinations of predictive coefficients and autocorrelation coefficients were evaluated as features. A keyword or key sound slides along the stored speech, and at each position a distance to the corresponding speech segment is computed as a measure of similarity (the smaller the distance, the closer the match). We found an efficient distance measure for non-verbal sound search. The average detection rate reached 93 percent in keyword search and 74 percent in non-verbal sound search. A system developed for automatic search in audio files is presented.
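The sliding search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the feature vectors (for example, mel-frequency cepstral coefficients) have already been extracted, it uses a plain mean Euclidean distance between frames without any time alignment, and the names `frame_distance`, `sliding_distances`, `detect` and the `threshold` parameter are hypothetical. The distance measure the paper found efficient for non-verbal sounds is not given in the abstract and is not reproduced here.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sliding_distances(template, recording):
    """Slide the template along the recording, one frame at a time.

    Both arguments are sequences of feature vectors (one vector per frame).
    Returns the mean frame-wise distance for every template position.
    """
    n_t, n_r = len(template), len(recording)
    if n_t == 0 or n_t > n_r:
        raise ValueError("template must be non-empty and no longer than the recording")
    distances = []
    for start in range(n_r - n_t + 1):
        total = sum(
            frame_distance(frame, recording[start + i])
            for i, frame in enumerate(template)
        )
        distances.append(total / n_t)
    return distances


def detect(template, recording, threshold):
    """Return (best position, its distance, whether it passes the threshold).

    The threshold is an assumed tuning parameter: a smaller distance means
    a closer match, so a hit is declared when the minimum is at or below it.
    """
    distances = sliding_distances(template, recording)
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best], distances[best] <= threshold


# Toy data: six frames of two-dimensional features, with the template
# copied from frames 2 and 3 of the recording.
recording = [[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0], [1.0, 1.0], [0.0, 0.0]]
template = [[2.0, 1.0], [3.0, 3.0]]
print(detect(template, recording, threshold=0.5))  # prints (2, 0.0, True)
```

In practice, spoken keywords vary in duration, so a fixed frame-by-frame comparison like this one would likely need some form of time alignment; the sketch only shows the sliding and thresholding steps.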
Authors retain copyright and grant the journal the right of first publication, with the paper simultaneously licensed under the Creative Commons Attribution 4.0 (CC BY 4.0) licence.
Authors are allowed to enter into separate, additional contractual arrangements for the non-exclusive distribution of the paper published in the journal with an acknowledgement of the initial publication in the journal.
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.