Aalto University Schools of Technology - electronic academic dissertations - http://otalib.aalto.fi/fi/kokoelmat_tiedonhaku/e-julkaisut/vaitoskirjat/

New Linear Predictive Methods for Digital Speech Processing

Susanna Varho

Dissertation for the degree of Doctor of Philosophy to be presented with due permission for public examination and debate in Auditorium S4, Department of Electrical and Communications Engineering, Helsinki University of Technology (Espoo, Finland), on the 20th of April, 2001, at 12 o'clock noon.

Overview in PDF format (ISBN 951-22-5410-7)   [264 KB]
The dissertation is also available in print (ISBN 951-22-5409-3)

Abstract

Speech processing is needed whenever speech is to be compressed, synthesised or recognised by means of electrical equipment. Phones of different types, multimedia equipment and interfaces to various electronic devices all require digital speech processing. As an example, a GSM phone applies speech processing in its RPE-LTP encoder/decoder (ETSI, 1997). In this coder, a 20 ms segment of speech is analysed first in the short-term prediction (STP) part and then in the long-term prediction (LTP) part. Finally, speech compression is achieved in the RPE encoding part, where only 1/3 of the encoded samples are selected for transmission.
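The sample selection in the RPE step described above can be sketched as follows. This is a simplified floating-point illustration, not the bit-exact ETSI procedure: it assumes a 40-sample sub-block of residual, keeps every third sample, and picks the candidate grid with the highest energy. The function name `select_rpe_grid` and its parameters are hypothetical, and the weighting filter and quantisation of the real coder are omitted.

```python
import numpy as np

def select_rpe_grid(residual, decimation=3, grid_len=13):
    """Pick the decimated sub-sequence (grid) with the highest energy.

    Simplified sketch: each candidate grid starts at a different offset
    and keeps every `decimation`-th sample, so roughly 1/3 of the
    samples survive when decimation is 3.
    """
    residual = np.asarray(residual, dtype=float)
    # Number of start offsets for which a full grid fits in the block.
    n_offsets = len(residual) - decimation * (grid_len - 1)
    if n_offsets < 1:
        raise ValueError("block too short for the requested grid")

    best_offset, best_energy = 0, -1.0
    for offset in range(n_offsets):
        candidate = residual[offset::decimation][:grid_len]
        energy = float(np.dot(candidate, candidate))
        if energy > best_energy:
            best_offset, best_energy = offset, energy

    selected = residual[best_offset::decimation][:grid_len]
    return best_offset, selected

# Usage: a 40-sample block yields 13 transmitted samples plus a grid offset.
block = np.sin(0.3 * np.arange(40))
offset, kept = select_rpe_grid(block)
```

With these assumed sizes, 13 of 40 samples are kept, which is roughly the 1/3 ratio mentioned in the text; the decoder would also need the chosen offset to place the samples back on the right grid.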

This thesis presents modifications to one of the most widely applied techniques in digital speech processing, namely linear prediction (LP). In recent decades, linear prediction has played an important role in telecommunications and in other areas related to speech compression and recognition. In linear prediction, the sample s(n) is predicted as a linear combination of its p previous samples, with the coefficients of the combination chosen to minimise the prediction error. This time-domain procedure corresponds to modelling the spectral envelope of the speech spectrum in the frequency domain. How accurately the envelope matches the speech spectrum depends strongly on the order of the resulting all-pole filter. The order, in turn, is usually related to the number of parameters required to define the model, and hence to be transmitted.
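As an illustration of conventional linear prediction as described above, the following minimal sketch estimates the p predictor coefficients with the autocorrelation method and evaluates the corresponding all-pole envelope. It assumes that the prediction error is measured as squared-error energy over one frame; the abstract does not specify the estimation method, and the function names `lpc_autocorrelation` and `allpole_envelope` are hypothetical. It shows the conventional baseline only, not the modified methods proposed in the thesis.

```python
import numpy as np

def lpc_autocorrelation(frame, order):
    """Estimate a_1..a_p so that s(n) is approximated by sum_k a_k * s(n-k).

    Solves the normal equations R a = r built from the frame's
    autocorrelation, which minimises the squared prediction error.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Symmetric Toeplitz matrix with entries r(|i - j|).
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    # Energy of the remaining prediction error.
    error = r[0] - np.dot(a, r[1:])
    return a, error

def allpole_envelope(a, gain=1.0, n_points=256):
    """Magnitude response gain / |A(e^{jw})| with A(z) = 1 - sum_k a_k z^-k."""
    a = np.asarray(a, dtype=float)
    w = np.linspace(0.0, np.pi, n_points)
    k = np.arange(1, len(a) + 1)
    A = 1.0 - np.exp(-1j * np.outer(w, k)) @ a
    return w, gain / np.abs(A)

# Usage: a decaying exponential is predicted almost perfectly with p = 1.
frame = 0.9 ** np.arange(200)
coeffs, err = lpc_autocorrelation(frame, order=1)
freqs, env = allpole_envelope(coeffs)
```

Raising `order` adds poles to the filter and lets the envelope follow the spectrum more closely, at the cost of more coefficients to store or transmit, which is the trade-off the paragraph above describes.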

Our study presents new predictive methods that modify conventional linear prediction by selecting and combining the previous samples differently. This algorithmic development aims at new all-pole techniques that could represent speech spectra with fewer parameters.

This thesis consists of an overview and the following nine publications:

  1. S. Varho and P. Alku, A Linear Predictive Method Using Extrapolated Samples for Modelling of Voiced Speech, Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, Session IV, pp. 13-16 (1997).
  2. S. Varho and P. Alku, Separated Linear Prediction - A new all-pole modelling technique for speech analysis, Speech Communication, Volume 24, pp. 111-121 (1998).
  3. S. Varho and P. Alku, Regressive Linear Prediction with Triplets - An Effective All-Pole Modelling Technique for Speech Processing, Proceedings of IEEE International Symposium on Circuits and Systems, Monterey, CA, Volume IV, pp. 194-197 (1998).
  4. S. Varho and P. Alku, Spectral Estimation of Voiced Speech with Regressive Linear Prediction, Proceedings of IX European Signal Processing Conference, Rhodes, Greece, Volume II, pp. 1189-1192 (1998).
  5. P. Alku and S. Varho, A New Linear Predictive Method for Compression of Speech Signals, Proceedings of the 5th International Conference on Spoken Language Processing, Sydney, Australia, Volume VI, pp. 2563-2566 (1998).
  6. S. Varho and P. Alku, A New Predictive Method for All-Pole Modelling of Speech Spectra with a Compressed Set of Parameters, Proceedings of IEEE International Symposium on Circuits and Systems, Orlando, FL, Volume III, pp. 126-129 (1999).
  7. S. Varho and P. Alku, A Linear Predictive Method for Highly Compressed Presentation of Speech Spectra, to be published in Proceedings of IEEE International Symposium on Circuits and Systems, Geneva, Switzerland, Volume V, pp. 57-60 (2000).
  8. S. Varho and P. Alku, Linear Prediction of Speech by Sample Grouping, Proceedings of IEEE Nordic Signal Processing Symposium, Kolmården, Sweden, pp. 113-116 (2000).
  9. S. Varho and P. Alku, Separated Linear Prediction - Improved Spectral Modelling by Sample Grouping, Proceedings of IEEE International Symposium on Intelligent Signal Processing and Communication Systems, Honolulu, HI, pp. 731-735 (2000).

Keywords: linear prediction, speech processing, speech analysis, spectral estimation

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

© 2001 Helsinki University of Technology


Last update 2011-05-26