Comparison of Linear Discriminant Analysis Approaches in Automatic Speech Recognition
Abstract
Acoustic models in speech recognition systems are commonly built as hidden Markov models with Gaussian mixture models as observation density functions. These models have a large number of parameters, which usually leads to data sparsity, especially for under-resourced languages such as Serbian. One way to mitigate data sparsity is to reduce the number of features. Linear discriminant analysis (LDA) and heteroscedastic LDA (HLDA) are two common dimensionality reduction techniques in automatic speech recognition. The paper compares speech recognition systems for Serbian in which both techniques are applied, varying the type of input features as well as the number of output features of (H)LDA. The best results are obtained with HLDA when the input vectors are concatenations of feature vectors across 7 successive frames, where each feature vector contains 12 mel-frequency cepstral coefficients (MFCCs) and normalized energy, and the number of output features is 32 or 35.
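To make the dimensions in the abstract concrete, the following is a minimal sketch of the frame-splicing and projection pipeline: 13 features per frame (12 MFCCs plus normalized energy) spliced over 7 frames give 91-dimensional input vectors, which are then projected to 32 dimensions. It uses classical LDA only; HLDA, which the paper reports as the better option, requires an iterative maximum-likelihood estimate and is not shown. The data are synthetic, and the function names, the 40 classes, the edge padding, and the small ridge term are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def splice_frames(feats, context=3):
    """Concatenate each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first/last frame (an assumption).
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + num_frames] for i in range(2 * context + 1)])

def lda_projection(X, y, n_out):
    """Return a (d, n_out) classical LDA projection matrix.

    Solves the generalized eigenproblem Sb w = lambda Sw w by whitening
    with the Cholesky factor of the within-class scatter matrix Sw.
    """
    d = X.shape[1]
    global_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        diff = (class_mean - global_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw += 1e-6 * np.eye(d)  # small ridge for numerical stability (assumption)

    L = np.linalg.cholesky(Sw)                 # Sw = L L^T
    A = np.linalg.solve(L, Sb)                 # L^-1 Sb
    M = np.linalg.solve(L, A.T).T              # L^-1 Sb L^-T (symmetric)
    evals, V = np.linalg.eigh(M)
    top = np.argsort(evals)[::-1][:n_out]      # largest eigenvalues first
    return np.linalg.solve(L.T, V[:, top])     # map back: W = L^-T V

# Synthetic stand-in for real data: 2000 frames, 13 features, 40 classes.
# In a real system the class labels would come from a forced alignment.
rng = np.random.default_rng(0)
num_frames, num_feats, num_classes = 2000, 13, 40
labels = rng.integers(0, num_classes, size=num_frames)
class_means = rng.normal(size=(num_classes, num_feats))
feats = class_means[labels] + 0.5 * rng.normal(size=(num_frames, num_feats))

spliced = splice_frames(feats, context=3)      # 7 frames x 13 features
W = lda_projection(spliced, labels, n_out=32)
reduced = spliced @ W

print(spliced.shape, W.shape, reduced.shape)
```

Classical LDA yields at most one fewer useful dimension than the number of classes, so the sketch uses 40 classes to make a 32-dimensional output meaningful.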
Authors retain copyright and grant the journal the right of first publication, with the paper simultaneously licensed under the Creative Commons Attribution 4.0 (CC BY 4.0) licence.
Authors are allowed to enter into separate, additional contractual arrangements for the non-exclusive distribution of the paper published in the journal with an acknowledgement of the initial publication in the journal.
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.