Advanced Recognition of Lithuanian Digit Names Using Hybrid Approach
The paper deals with the recognition of digit names and with hybrid recognition technology. By the hybrid approach we mean the combination of two or more different recognizers to achieve higher recognition accuracy. Two Lithuanian recognizers, using word-based and phoneme-based hidden Markov models (HMMs), were investigated together with the Spanish language recognizer 8.0 (Spanish-US) and the Microsoft Speech Server Spanish language recognizer 9.0 (Spanish-US). Using the data mining package Weka, classification experiments were carried out for five different recognizer-combination scenarios. The results of combining two or three recognizers showed that the proposed use of machine learning to combine different recognizers greatly improved the recognition accuracy on the digit speech corpus in all five cases. Manual annotation of part of the speech corpus increases the recognition accuracy of Lithuanian digit names by about 40 % when the sub-word-based recognizer is used. The SAMPA_LT phoneme set is redundant for digit recognition.
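The idea of combining recognizers through a learned classifier can be illustrated with a minimal sketch. It is not the Weka classifier used in the paper: as a simple stand-in, it learns a lookup table from tuples of recognizer hypotheses to the most frequent true label, and falls back to a majority vote for unseen tuples. The function names `train_combiner` and `combine` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_combiner(hypotheses, labels):
    """For each tuple of recognizer outputs, remember the most frequent true label."""
    counts = defaultdict(Counter)
    for hyp, label in zip(hypotheses, labels):
        counts[tuple(hyp)][label] += 1
    return {hyp: c.most_common(1)[0][0] for hyp, c in counts.items()}


def combine(table, hyp):
    """Predict a label from recognizer outputs; majority vote if the tuple is unseen."""
    hyp = tuple(hyp)
    if hyp in table:
        return table[hyp]
    return Counter(hyp).most_common(1)[0][0]


# Invented toy data: each tuple holds the outputs of three recognizers
# for one utterance; the labels are the digit names actually spoken.
train_hyps = [
    ("du", "du", "du"),
    ("trys", "du", "trys"),
    ("du", "trys", "trys"),  # here the first recognizer alone is right
]
train_labels = ["du", "trys", "du"]

table = train_combiner(train_hyps, train_labels)
learned = combine(table, ("du", "trys", "trys"))       # seen tuple: learned label
fallback = combine(table, ("vienas", "vienas", "du"))  # unseen tuple: majority vote
```

In this toy case the learned table overrides the majority vote for a pattern where one recognizer is known to be more reliable, which is the kind of gain a trained combiner can offer over simple voting.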
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.