Speaker Recognition using Gaussian Mixture Models
Abstract
Gaussian mixture models (GMMs) are among the most popular statistical methods in speaker recognition. This research reports speaker recognition experiments with three feature sets: four formants, four formants with fundamental frequency, and mel-scale cepstral coefficients. GMMs with mel-scale cepstral coefficients are the baseline in speaker recognition and give some of the best results in text-independent speaker recognition. The experimental results show that mel-scale cepstral coefficients and four formants with fundamental frequency give nearly the same recognition accuracy, but building the Gaussian mixture speaker models and running recognition takes several times longer with mel-scale cepstral coefficients, because the number of calculations is several times greater in that case. Using only four formants gives the lowest recognition accuracy. Ill. 7, bibl. 12 (in English; summaries in English, Russian and Lithuanian).
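As an illustration of the recognition step summarised in the abstract, the sketch below scores a sequence of feature frames against per-speaker Gaussian mixture models and picks the speaker with the highest average log-likelihood. It is a minimal sketch under stated assumptions, not the authors' implementation: diagonal covariances are assumed, the models are taken as already trained, and the function names, speaker labels and toy numbers are all hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x | model) for a mixture given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp keeps the sum numerically stable
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def average_score(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(gmm_log_likelihood(x, gmm) for x in frames) / len(frames)


def identify(frames, models):
    """Return the label of the speaker model that best explains the frames."""
    return max(models, key=lambda name: average_score(frames, models[name]))


# Hypothetical two-component models over 2-dimensional features.
# In the paper's setting each vector would instead hold formants,
# formants plus fundamental frequency, or cepstral coefficients.
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [7.0, 7.0], [1.0, 1.0])],
}
frames = [[0.1, -0.2], [1.9, 2.2], [0.3, 0.1]]
best = identify(frames, models)
```

In this sketch the work per frame grows with the number of mixture components times the length of the feature vector, which is consistent with the abstract's observation that a longer feature vector makes model building and recognition slower.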
License
The copyright for the paper in this journal is retained by the author(s) with the first publication right granted to the journal. The authors agree to the Creative Commons Attribution 4.0 (CC BY 4.0) agreement under which the paper in the Journal is licensed.
By virtue of their appearance in this open access journal, papers are free to use with proper attribution in educational and other non-commercial settings with an acknowledgement of the initial publication in the journal.