LPC for Speech Recognition
LPC has been widely used in speech recognition systems. In this section we describe the method we implemented for recognition of numbers 1 to 5, using LPC cepstral coefficients. We followed the basic ideas proposed by Markel et al. [2], Papamichalis [5] and Rabiner [6]. Figure 1 shows a block diagram of the speech recognition system. The basic steps in the processing of each word are the following:
Figure 1: Block diagram of the speech recognition system.
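1. Preemphasis The speech signal (here also referred to as a word), s[n], is filtered with a first-order FIR filter to spectrally flatten the signal. We used one of the most widely used preemphasis filters, of the form
\[ H(z) = 1 - a z^{-1}, \qquad \hat{s}[n] = s[n] - a\, s[n-1], \]
where the coefficient a is typically chosen between 0.9 and 1.0.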
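The preemphasis step can be sketched in a few lines. This is a minimal Python illustration, not the authors' Matlab code; the function name `preemphasize` and the default coefficient 0.95 are assumptions (0.95 is a commonly quoted value, but the value used in this project is not stated here).

```python
def preemphasize(signal, a=0.95):
    """First-order FIR preemphasis: y[n] = s[n] - a * s[n-1].

    The default a=0.95 is an assumed, commonly quoted value.
    The sample before the start of the signal is taken as zero.
    """
    if not signal:
        return []
    out = [float(signal[0])]
    for n in range(1, len(signal)):
        out.append(signal[n] - a * signal[n - 1])
    return out


if __name__ == "__main__":
    # A constant (low-frequency) signal is strongly attenuated after the first sample.
    print(preemphasize([1.0, 1.0, 1.0, 1.0]))
```

Because the filter subtracts a scaled copy of the previous sample, slowly varying components are attenuated while rapid changes pass through, which is the spectral flattening described above.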

2. Normalization After preemphasis, each word has its energy normalized. The center of gravity of the energy distribution along the temporal axis is then computed, and this value is used as the reference for temporal alignment of the words. Appendix B shows examples of temporal alignment. The energy of each word was computed using 60 non-overlapping windows. The program is in Appendix C. 
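As a rough illustration of this step, the sketch below normalizes a word to unit total energy, computes a short-time energy profile over non-overlapping windows, and finds the center of gravity of that profile. It is written in Python rather than Matlab, and the function names, the choice of unit-energy scaling, and the use of the window index as the time axis are all assumptions; the program in Appendix C may differ.

```python
def normalize_energy(signal):
    """Scale the signal so that its total energy (sum of squares) is 1.

    Unit-energy scaling is an assumption made for this sketch.
    """
    energy = sum(x * x for x in signal)
    if energy == 0:
        return [float(x) for x in signal]
    scale = energy ** -0.5
    return [x * scale for x in signal]


def energy_profile(signal, n_windows=60):
    """Energy in each of n_windows equal, non-overlapping windows.

    Samples left over after the last full window are ignored.
    """
    size = len(signal) // n_windows
    if size == 0:
        raise ValueError("signal is shorter than the number of windows")
    return [
        sum(x * x for x in signal[i * size:(i + 1) * size])
        for i in range(n_windows)
    ]


def center_of_gravity(profile):
    """Energy-weighted mean window index of the profile."""
    total = sum(profile)
    if total == 0:
        raise ValueError("profile has zero total energy")
    return sum(i * e for i, e in enumerate(profile)) / total
```

Two words could then be aligned by shifting each one so that its center of gravity falls at the same window index.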
3. Frame Blocking The preemphasized speech signal, \hat{s}[n], is blocked into frames of N samples, with adjacent frames separated by M samples. Table 1 gives the values used for N and M. If we denote the l-th frame of speech by x_l[n], and there are L frames, then
\[ x_l[n] = \hat{s}[Ml + n], \qquad n = 0, 1, \ldots, N-1, \quad l = 0, 1, \ldots, L-1. \]
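Frame blocking amounts to slicing the signal at a fixed stride. The Python sketch below uses hypothetical names (`block_frames`, `frame_len` for N, `frame_shift` for M) and assumes that an incomplete final frame is dropped, which the text does not specify.

```python
def block_frames(signal, frame_len, frame_shift):
    """Split signal into frames of frame_len samples, one every frame_shift samples.

    Frame l holds signal[l * frame_shift : l * frame_shift + frame_len].
    An incomplete trailing frame is dropped (an assumption of this sketch).
    """
    if frame_len <= 0 or frame_shift <= 0:
        raise ValueError("frame_len and frame_shift must be positive")
    if len(signal) < frame_len:
        return []
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    return [
        list(signal[l * frame_shift:l * frame_shift + frame_len])
        for l in range(n_frames)
    ]


if __name__ == "__main__":
    # Frames of 4 samples every 3 samples, so adjacent frames share one sample.
    for frame in block_frames(list(range(10)), frame_len=4, frame_shift=3):
        print(frame)
```

When the shift is smaller than the frame length, consecutive frames overlap by the difference between the two, and the spectral estimates vary more smoothly from frame to frame.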

4. Windowing Each individual frame is windowed to minimize the signal discontinuities at the borders of each frame. If the window is defined as w[n], 0 ≤ n ≤ N-1, then the windowed signal is
\[ \tilde{x}_l[n] = x_l[n]\, w[n], \qquad 0 \le n \le N-1. \]
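The window type is not named in this section. As an illustration only, the Python sketch below assumes a Hamming window, which is a common choice for LPC analysis; the function names are hypothetical.

```python
import math


def hamming(n_samples):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1)), 0 <= n <= N-1."""
    if n_samples <= 0:
        return []
    if n_samples == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
        for n in range(n_samples)
    ]


def apply_window(frame, window=None):
    """Multiply a frame sample by sample with the window (Hamming if none is given)."""
    if window is None:
        window = hamming(len(frame))
    if len(window) != len(frame):
        raise ValueError("window and frame must have the same length")
    return [x * w for x, w in zip(frame, window)]
```

The window tapers each frame toward its edges, so the samples near the frame borders contribute little and the discontinuity introduced by cutting the signal is reduced.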

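5. LPC Parameters The next processing step is the LPC analysis of each windowed frame using the autocorrelation method of order p. In matrix form, the LPC coefficients a_1, …, a_p satisfy
\[
\begin{bmatrix}
r(0) & r(1) & \cdots & r(p-1) \\
r(1) & r(0) & \cdots & r(p-2) \\
\vdots & \vdots & \ddots & \vdots \\
r(p-1) & r(p-2) & \cdots & r(0)
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix}
=
\begin{bmatrix} r(1) \\ r(2) \\ \vdots \\ r(p) \end{bmatrix},
\]
where r(k) is the autocorrelation of the windowed frame at lag k.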
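The autocorrelation method leads to a symmetric Toeplitz system, which is usually solved with the Levinson-Durbin recursion. The Python sketch below shows that recursion under the predictor convention s[n] ≈ a_1 s[n-1] + … + a_p s[n-p]. The source does not say which solver the authors used, so treat the solver choice and the function names as assumptions.

```python
def autocorrelation(frame, p):
    """Autocorrelation r(k) = sum_n frame[n] * frame[n+k] for lags k = 0..p."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k))
        for k in range(p + 1)
    ]


def levinson_durbin(r, p):
    """Solve the order-p autocorrelation normal equations.

    r is the list [r(0), ..., r(p)]. Returns (a, err), where a = [a_1, ..., a_p]
    are the predictor coefficients and err is the final prediction error energy.
    """
    a = [0.0] * (p + 1)  # a[0] is unused; a[j] holds a_j
    err = float(r[0])
    for i in range(1, p + 1):
        if err <= 0.0:
            raise ValueError("non-positive prediction error; cannot continue")
        # Reflection coefficient for order i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

For a frame whose autocorrelation decays as r(k) = 0.5^k, the recursion returns a_1 = 0.5 and all higher coefficients equal to zero, as expected for a first-order autoregressive signal.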

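6. LPC Parameter Conversion to Cepstral Coefficients The LPC cepstral coefficients, c_m, are a very important set of LPC-derived parameters used in speech recognition. They can be derived directly from the set of LPC coefficients a_i, for i=1,…,p, using the recursion
\[
c_m = a_m + \sum_{k=1}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \qquad 1 \le m \le p,
\]
\[
c_m = \sum_{k=m-p}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \qquad m > p.
\]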
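The conversion can be sketched as a short loop. This Python illustration assumes the predictor convention s[n] ≈ a_1 s[n-1] + … + a_p s[n-p], so that the all-pole model is 1 / (1 - Σ a_k z^{-k}); with the opposite sign convention for a_i the signs in the recursion change. The function name `lpc_to_cepstrum` is hypothetical, and the gain term c_0 is omitted.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients a = [a_1, ..., a_p] to cepstral coefficients.

    Returns [c_1, ..., c_{n_ceps}] using
        c_m = a_m + sum_{k=1}^{m-1} (k/m) * c_k * a_{m-k}
    with a_j treated as zero for j > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused; c[m] holds c_m
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        # Only terms with 1 <= m - k <= p contribute.
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]


if __name__ == "__main__":
    # For a single-pole model 1 / (1 - a z^-1) the cepstrum is c_m = a**m / m.
    print(lpc_to_cepstrum([0.5], 3))
```

The single-pole case is a convenient check, because the logarithm of 1 / (1 - a z^{-1}) has the known series Σ (a^m / m) z^{-m}.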

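7. Cepstral Distance The cepstral coefficients provide an efficient computation of the log-spectral distance between two frames [5]. For LPC models, which represent smoothed envelopes of the speech spectra, a truncated set of cepstral coefficients is usually used. In our work we used a truncated cepstral distance [6] defined by
\[ d_c^2 = \sum_{m=1}^{Q} \left( c_m - c'_m \right)^2, \]
where c_m and c'_m are the cepstral coefficients of the two frames and Q is the number of coefficients retained.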
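A truncated cepstral distance is a sum of squared coefficient differences over the first few cepstral coefficients. The Python sketch below is illustrative; the function name and the optional `n_terms` argument are assumptions, and the vectors are taken to start at c_1.

```python
def cepstral_distance(c1, c2, n_terms=None):
    """Squared Euclidean distance between the first n_terms cepstral coefficients.

    c1 and c2 are lists [c_1, c_2, ...]. If n_terms is None, the shorter
    of the two lengths is used.
    """
    limit = min(len(c1), len(c2))
    if n_terms is None:
        n_terms = limit
    if n_terms > limit:
        raise ValueError("n_terms exceeds the available coefficients")
    return sum((c1[m] - c2[m]) ** 2 for m in range(n_terms))


if __name__ == "__main__":
    # Only the first two coefficients are compared here.
    print(cepstral_distance([1.0, 2.0, 3.0], [1.0, 0.0, 0.0], n_terms=2))
```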
8. Training and Classification In the last step, we build a codebook of cepstral coefficients. Each of the five word classes (the numbers one to five) is represented by 58 vectors of 15 coefficients each, where each vector represents one frame of the class. A Matlab routine computes the average vector for each frame from a set of 30 words per class. The codebook is stored and used by the classification routine. The program used in the training stage is in Appendix E. Classification of an arbitrary input is essentially a full search through the codebook for the best match. A Matlab classification routine computes the cepstral coefficients of the unknown input word and then the distance between each vector of the input word and the corresponding vector in the codebook. The input word is assigned the number of the class that gives the minimum total distance. The classification program is in Appendix F. The program in Appendix G was used to play the Matlab data files. 
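The training and classification logic described in step 8 can be sketched as follows. This is a Python illustration with hypothetical names (`train_class`, `classify`) and a toy two-frame, two-coefficient data set; it is not the Matlab code from Appendices E and F, and it assumes every word has already been aligned to the same number of frames.

```python
def train_class(words):
    """Average the training words of one class, frame by frame.

    words is a list of words; each word is a list of frames; each frame
    is a list of cepstral coefficients. All words must have the same shape.
    """
    n_words = len(words)
    n_frames = len(words[0])
    dim = len(words[0][0])
    return [
        [sum(w[f][d] for w in words) / n_words for d in range(dim)]
        for f in range(n_frames)
    ]


def frame_distance(x, y):
    """Squared Euclidean distance between two cepstral vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def classify(word, codebook):
    """Return the label whose template has the minimum total frame distance."""
    def total_distance(label):
        return sum(frame_distance(x, y) for x, y in zip(word, codebook[label]))
    return min(codebook, key=total_distance)


if __name__ == "__main__":
    training = {
        "one": [[[0.0, 0.0], [1.0, 1.0]], [[0.2, 0.0], [1.0, 0.8]]],
        "two": [[[5.0, 5.0], [6.0, 6.0]]],
    }
    codebook = {label: train_class(ws) for label, ws in training.items()}
    print(classify([[0.1, 0.1], [0.9, 1.0]], codebook))
```

Since each class is stored as one averaged template, the full search costs one distance computation per frame per class, which is small for five classes.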
Results
For the tests we used a training set consisting of 30 occurrences of each digit spoken by 3 talkers (i.e., 10 occurrences of each digit per talker). All the talkers were male. The error rate, measured on essentially the same set, was less than 3% (more than 97% correct classifications). Table 2 gives the errors. The overall results are also in the Result section. 
Appendix
A. Preemphasized Signal 
B. Temporal Alignment 
C. Program – Normalization 
D. Windowed Signal 
E. Program – LPC Cepstral Coefficients 
F. Program – Classification 
G. Program – Auxiliary 
Source:
http://www.clear.rice.edu/elec532/PROJECTS98/speech/cepstrum/cepstrum.html