Automatic Segmentation of Phonemes using Artificial Neural Networks
Abstract
Automatic segmentation of phonemes is often used in speech technology. The purpose of this research is to determine how well perceptron and back-propagation artificial neural networks (which can learn linear and non-linear relationships in the patterns) distinguish between different phonemes, using features of the speech signal that are common in speech and speaker recognition tasks: linear predictive coding (LPC) coefficients, cepstral coefficients, and Fourier transform coefficients (the energy density spectrum). Artificial neural networks can also be used to set the start and end points of a word. They can separate from noise not only the voiced frames of the signal but also the unvoiced frames, whose spectrum is similar to that of noise. The experiments carried out show that all of the feature vectors used are suitable for segmenting phonemes. However, for separating phonemes from noise by automatically setting the start and end points of a word, the Fourier transform coefficients are the most suitable, whereas cepstral coefficients are not suitable. Ill. 8, bibl. 7 (in Lithuanian; summaries in English, Russian and Lithuanian).
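As an illustration of two of the feature types named in the abstract, the following minimal sketch computes an energy density spectrum and a real cepstrum for a single frame, and trains a single-layer perceptron with the classic perceptron learning rule. It is not the authors' implementation: the function names `frame_features` and `train_perceptron`, the Hamming window, the number of cepstral coefficients, and the learning rate are all assumptions made for the example, and LPC coefficients are left out for brevity.

```python
import numpy as np


def frame_features(frame, n_coeffs=12):
    """Return (energy density spectrum, real cepstrum) for one frame.

    Hypothetical helper; the window choice and n_coeffs are assumptions.
    """
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed)
    # Energy density spectrum: squared magnitude of the Fourier coefficients.
    energy = np.abs(spectrum) ** 2
    # Real cepstrum: inverse transform of the log magnitude spectrum.
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)
    cepstrum = np.fft.irfft(log_magnitude, n=len(frame))[:n_coeffs]
    return energy, cepstrum


def train_perceptron(X, y, epochs=50, lr=1.0):
    """Train a single-layer perceptron on rows of X with 0/1 labels y."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            # Weights change only when the prediction is wrong.
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
    return w, b


def predict(X, w, b):
    """Classify each row of X as 1 (e.g. speech) or 0 (e.g. noise)."""
    return (X @ w + b > 0).astype(int)
```

In a setting like the one described, each row of `X` would hold the feature vector of one frame and `y` would mark the frame as speech or noise; a single perceptron can separate the two classes only if they are linearly separable in the chosen feature space.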
License
The copyright for the paper in this journal is retained by the author(s) with the first publication right granted to the journal. The authors agree to the Creative Commons Attribution 4.0 (CC BY 4.0) agreement under which the paper in the Journal is licensed.
By virtue of their appearance in this open access journal, papers are free to use with proper attribution in educational and other non-commercial settings with an acknowledgement of the initial publication in the journal.